BOSTON UNIVERSITY 
College of Business Administration 


THESIS 


The Testing Program for the Social-Business Subjects 
of the Secondary School


by 
HOWARD CLYDE TRACY 


submitted in partial fulfillment of 
the requirements for the degree of 


MASTER OF COMMERCIAL SCIENCE 


1935
No professional man, then, thinks of giving according 
to measure. Once engaged, he gives his best, gives his 
personal interest, himself. His heart is in his work, and 
for this no equivalent is possible; what is accepted is in
the nature of a fee, gratuity, or consideration, which en- 
ables him who receives it to maintain a certain expected 
mode of life. The real payment is the work itself, this 
and the chance to join with other members of the profession 
in guiding and enlarging the sphere of its activities.
GEORGE HERBERT PALMER in The Ideal Teacher. 


CONTENTS


A. INTRODUCTION
   I. Newness of the Modern Test Movement
   II. Unfamiliarity of Many Teachers with Modern Testing
       Procedures
   III. New-Type Tests Aid with the Problem of Handling
        Larger Classes
   IV. Practical Significance to the Principals and
       Supervisors of the Secondary School


B. HISTORICAL SUMMARY OF THE TESTING MOVEMENT
   I. Ancient Period
   II. Medieval Period
   III. The Boston Examination of 1845
   IV. Development of Intelligence Tests
   V. Development of Standardized Tests


C. SOCIAL-BUSINESS SUBJECTS OF THE SECONDARY SCHOOL
   I. Status of These Subjects
   II. Need for Social-Business Subjects
   III. Need for Reorganization of the Content


D. CRITERIA OF A GOOD EXAMINATION
   I. Validity
      a. What It Is
      b. Principal Methods for Validating a Test
      c. Further Suggestions
   II. Reliability
      a. What It Is
      b. Principal Methods for Obtaining Reliability
      c. Factors Influencing Reliability in a Test
      d. Reliability of the New-Type Examination
   III. Objectivity
      a. What It Is
      b. The Need for Objectivity in Tests
   IV. Other Desirable Characteristics
      a. Comprehensiveness
      b. Facility
      c. Utility
      d. Rapport

E. STANDARDIZED TESTS versus INFORMAL, TEACHER-MADE TESTS
   I. Main Differences
      a. Demonstrated Validity
      b. Demonstrated Reliability
      c. Differences in Degree of Objectivity
      d. Provision for Norms or Standards
   II. Conclusions


F. FUNCTIONS OF ...
   I. Testing Retention of Information
      a. Check-up on Units of Work Studied
   II. Determination of Achievement Status
      a. Necessity for Determining Complete Accomplishment
         in a Subject
      b. Check-up versus Inventory Tests
      c. Some Pre-Tests Are Really Inventory Tests
   III. Stimulation of Daily Work
      a. Motivation Values
      b. Pupils Learn Best When They Have Knowledge of the
         Progress They Are Making
   IV. Motivation of Reviews
      a. Necessity for Frequent Reviews
      b. Values Derived from Preparation for an Examination
   V. Provision of Objective Standards
      a. Inadequacy of Subjective Standards
      b. Objective Standards Involve Some Kind of Measurement
      c. Bad Features of Standardized Tests as Objectives
   VI. Measurement of Teaching Efficiency
      a. Unfairness of Old Impressionistic Methods for
         Rating Teachers
      b. General Desirability of Teacher Rating
         1. Student-test Method for Measuring Teaching
            Efficiency Most Promising
         2. Factors That Would Prevent This Method From
            Operating Fairly
   VII. Improvement of Teaching Efficiency
      a. Need for Constant Teacher Improvement
      b. Building a Test Requires Teacher to be Thoroughly
         Conversant with Aims and Objectives of the Course
      c. Test Results Afford the Teacher Opportunity to
         Check on Relative Efficiencies of Different
         Instructional Methods
      d. The Use of Objective Tests for Supervision of
         Study and for Individual Teaching
      e. Necessity for Supervision of Testing
   VIII. Diagnosis of Special Difficulties
      a. Effective Teaching Requires the Diagnosis of
         Special Difficulties in Learning
      b. Educational Diagnosis Concerned with Both
         Individual and Group Diagnosis
      c. Interrelations of Survey and Diagnostic Tests
      d. Diagnosis Requires Knowledge of the Physical
         Basis of Learning
      e. Remedial Work in the Social Studies Difficult
   IX. Cultivation of Intellectual Powers
      a. Examinations Leave Individuals Taking Them with
         Intellectual Habits of Wide Use
      b. Many Professions and Vocations Require the Daily
         Passing of Oral Examinations


G. PROCEDURES FOR DRAFTING NEW-TYPE TESTS
   Section 1. General Procedure
      I. Drawing Up a Table of Specifications
      II. Drafting the Items in Preliminary Form
      III. Deciding Upon the Scope
      IV. Editing and Selecting the Final Items
      V. Rating the Items for Difficulty
      VI. Breaking the Items Into Alternative Forms
      VII. Rearranging the Items in Order of Difficulty
      VIII. Preparing the Instructions for the Test
      IX. Making the Answer Keys or Stencils
      X. Deciding Upon Rules for Scoring
   Section 2. Methods for Drafting Specific Parts of the
              Combination Test
      I. The True-False Test
      II. The Multiple-Choice Test
      III. The Matching Test
      IV. The Completion Test


H. SUMMARY OF EXISTING PUBLISHED TESTS AND EXAMINATIONS
   IN THE SOCIAL-BUSINESS SUBJECTS


I. METHODS OF CHANGING TEST SCORES INTO GRADES
   I. The Marking System
      a. Need for Marks
      b. Necessity for Having All the Teachers in a
         School Adhere to the Same System
   II. Absolute versus Relative Marking Systems
      a. The Percentage System
         1. Fallacies
      b. Systems Based on the Normal Curve
         1. The Missouri System
   III. Methods for Changing Test Scores Into Grades
      a. Proportional Method
      b. Sigma Method
         1. Ungrouped Scores
         2. Grouped Scores
         3. The Morrison Marking System
         4. The Percentile Ranking System


J. PROBLEM OF ABILITY GROUPING
   I. What Ability Grouping Is
      a. The Present Need
   II. How Classification May Be Carried Out
   III. Advantages
   IV. Disadvantages


K. THE GROUP INTELLIGENCE TEST AS AN AID TO THE COMMERCIAL ...
   I. Development of the Group Test
   II. Common Types of Material in Group Tests
   III. Limitations of Our Present Intelligence Tests
      a. Validity
      b. Accuracy
   IV. Practical Values of Intelligence Tests
   V. Group Intelligence Tests Suitable for Use in
      Secondary Schools


L. ...
   I. In General
   II. Specific Findings Summarized

A. INTRODUCTION 

The beginnings of the examination idea are lost in the 
hoary past. History does not provide us with the exact 
date when this concept originated, although students of 
education do know that even the most primitive peoples 
required their young men to undergo various examinations to 
determine physical fitness, proficiency in the art of war- 
fare, and ability to procure food. How different the 
examinations of today are from these early prototypes!
These primitive peoples with their tribal organization
reserved the examinations of the eligible young men to a
time when elaborate ceremonies could be held. At the
climax of the festivities the candidates publicly
submitted to their examinations. Passing the examination
entitled the young warriors to full membership in the
tribal organization. These ceremonial periods were
eagerly anticipated by all because of the feasting and
merry-making that accompanied them.

The gradual changes in the examination concept will 
be treated more fully later in this report. Let it 
suffice at this point to indicate that an evolutionary 
process went on over a period of thousands of years. 


Until a quarter of a century ago educators attempted to
measure entirely by two means, viz., informal oral testing
and the written traditional examination. In a series of
scientific studies, it was proved that the traditional or
essay-type examination was not satisfactory as a measuring
instrument because of its unreliability due to limited
sampling and subjectivity in scoring. Just prior to this
time, certain great educational authorities were
experimenting with rating scales and tests in the fields of
spelling, arithmetic, handwriting, and composition with
the intent of removing the subjective factor in teacher
marking. Coincident with this epochal development was the
work of the French genius, Alfred Binet, in the field of
psychological testing. These wonderful advances caused the
attention of progressive educators throughout the world
to be focused on tests and measurements.

At the present time it is safe to say that there are 
hundreds of published tests purporting to be scientific 
measuring instruments. Not only that, but each year new 
books and treatises are published to swell the ever- 
increasing mass of knowledge about testing methods and 
techniques. The research student is amazed at the great
mass of material that exists in this field. Notwithstanding,
it is safe to say that thousands of high school teachers
in the United States are ignorant of the principles
and procedures underlying the well-rounded testing program
that is so necessary for effective teaching. All too
often the classroom teacher has been content to administer
published tests just to show his supervisors that he is
progressive. In many cases the main values of the test
are lost because no provision is made for diagnosis
and remedial teaching. What should impress the educational
expert in many high schools is that there is no definite,
co-ordinated testing program. In commenting about this
situation, Blackstone makes the following observation:
"Each teacher gave examinations and short tests according
to any haphazard sort of plan that suited his fancy, and
there was rarely any attempt to perfect these devices, to
make sure that they measured fairly and accurately, or to
extend their scope outside the narrow realm of measuring
factual knowledge and an understanding of some ..." (1)
In order to reap the full benefits of twentieth-century
education the individual teacher should work out a sound
testing program and follow it, rather than give tests
whenever it suits his convenience.

An equally potent reason why teachers are turning
more and more to the new-type examination is the
necessity born of depression conditions. "Large
organizations of taxpayers everywhere are insistent that
the costs of government must be reduced. Since a large
part of local taxes goes to the support of education,
the public schools of America are facing a critical
period of economic adjustment. Reductions in school
budgets are being met by larger classes and increased
teaching loads." (2) This movement has not been confined
to merely a few high schools; in all parts of the
country, the same trend is evident. Teachers and
educators may bitterly attack this tendency, but it is
continuing nevertheless. It is a physical impossibility
for the ordinary teacher to quiz orally all his
pupils daily if he has large classes. This dilemma
would be unsolvable if it were not for new-type tests
with their correcting and scoring methods.


(1) Blackstone, Earl G., Com. Edu. in the High School,
General Editor, Harry Kitson, Ginn & Co., 1929.

Supervisors and administrators are also utilizing
the new-type examination to an increasing degree.
Both supervision and testing look to the same goal, the
improvement of teaching efficiency. Supervisors no
longer have the temerity to step into a classroom and
attempt to judge the teacher's ability according to the
old impressionistic method in ten or fifteen minutes.
The supervisor is aided tremendously by the results of
new-type tests administered to the classes in the ...
"Supervision based upon test results tends to be positive
and constructive. It forms a scientific approach for
helpful conferences, suggestions, and experimentation.
Without the use of test results, supervision is a hit and
miss procedure and is sometimes valueless, if not even
..." (3) "For the superintendent, standardized
tests have meant nothing less than the ultimate changing
of school administration from guess work to scientific
accuracy." (4) In conclusion, then, we may say that the
new-type examination with its accompanying correcting and
scoring procedures has been a material factor in
alleviating the work of all parties concerned: principals,
supervisors, heads of department, and the classroom
teacher.


(2) Carlson, Paul A., The Measurement of Business
Education, pp. 21, South-Western Publishing Co., 1932.


(3) Lang, Albert R., Modern Methods in Written
Examinations, pp. 291, Houghton-Mifflin Co., 1930.


(4) Cubberley, Ellwood P., Public Education in the United
States, Houghton-Mifflin Co., 1934.




B. HISTORICAL SUMMARY OF THE TESTING MOVEMENT

The beginnings of examinations go back before the 
dawn of recorded history. It is true, however, that the 
early examination was very different from the modern 
product; a process of evolution has gone on culminating 
in the standardized objective and intelligence tests of 
today. Early history supplies us with a number of in- 
stances of examinations. In the Old Testament we read 
how the Gileadites tested the Ephraimites upon their 
ability to pronounce the word "Shibboleth." The unlucky 
Ephraimite who pronounced it "Sibboleth" failed in his 
examination and was speedily put to death. 

This tragic test is not, however, to be considered
the first instance of examination. "As early as 2200 B.C.,
China had an elaborate national system of examinations,
for the purpose of selecting public officials," (1)
"and examinations of various kinds were in use hundreds
and even thousands of years ago by such people as the
Chinese, the Greeks, and the Romans." (2) It is safe to say
that tests of mental and physical traits that were
involved in the initiation ceremonies of primitive people
antedated even these first early attempts at examination.
Even primitive peoples were not slow to recognize the
necessity for the measurement of achievement of pupils as
an essential element of education. According to Alberty
and Thayer: "It may be said that even under primitive
conditions, the measurement of achievement of pupils is
an essential element of education. Before the youth may
share fully in the life of the male members of the tribe,
he has to receive certain elementary training at the
hands of the women, the end of which is marked by rites
and ceremonies which indicate the fitness of the
individual to participate fully in adult life." (3)


(1) Lang, Albert R., Modern Methods in Written
Examinations, pp. 1, Houghton-Mifflin Co., 1930.


(2) Odell, C. W., Educational Measurement in High School,
pp. 55, The Century Co., 1930.

In ancient times the Egyptians had taken tremendous
strides in perfecting tests for physical relationships.
Russell tells of their accomplishments thus: "Early
peoples, such as these Egyptians, developed certain
measures to a high degree and were able to achieve
astonishing results, both mechanically and scientifically,
in the fields so developed. In other fields they were
restricted. In surveying, in some of the mechanic arts,
and in such astronomical observation as could be
accomplished without the use of the telescope, these people
were highly ..." (4) The evidence does not prove, however,
that the Egyptians made any proportionate development in
achievement or any other kind of indirect testing.


(3) Alberty, H. R. and Thayer, V. T., Supervision in the 
Secondary School, pp. 328, D. C. Heath & Co., 1931. 


(4) Russell, Charles, Standard Tests, pp. 4, Ginn & Co., 1930. 




In the early Grecian period, the Athenians and
Spartans had attained considerable fame for their systems
of living and education. The Spartan system was intended
primarily to inculcate in its youth the ideals of
physical perfection and martial courage, whereas the
Athenian system aimed at a combination of both physical
development and cultural attainment. The Athenian system
recognized the individual; the Spartan considered the
individual only as an integral part of the state. Russell
says, "In these two systems of living and of education are
found the two contrasting elements of primitive education
and cultural education. The Spartan education had been
formalized and systematized until it had lost every
semblance of individualism; the Athenian education had been
freed until it became in the end almost completely
individualistic." (5) The Spartans tested ability to endure
pain by conducting regular examinations, in the form of
whippings, before the altar of Artemis. (6) Socrates, in his
famous method of questioning, submitted his pupils to
searching questioning which really was a form of incessant
examination. The examination concept had its genesis in
antiquity and has slowly developed through the ages.

The medieval period contains two influences that
affected the growth of examinations, viz., chivalry and the
medieval universities. Chivalry, or the medieval system
of knighthood, had as its ideal of education the
preparation of the young page to take his place in the ranks
of knighthood. His education consisted of two phases, which
were: first, training in the art of warfare given him by
the men-at-arms and knights, and secondly, training in the
etiquette and ideals of knighthood, the phase of education
which was entrusted to the ladies of the castle. Both
elements of education were considered necessary before the
candidate for knighthood could qualify as a full-fledged
knight. The raising of the esquire to the estate of
knighthood was embodied in appropriate ceremonies that
really were the approximate equivalent of an examination.
Russell makes the following comment: "Education in chivalry
considered from the point of view of one destined to
become a knight partook somewhat of the Spartan form of
education, since the page was drilled in all types of
exercises to fit him for ..." (7)


(5) Ibid, pp. 16.


(6) Lang, Albert R., Modern Methods in Written
Examinations, pp. 2, Houghton-Mifflin Co., 1930.

The medieval universities had a significant influence
in determining the trend of examinations. The medieval
university was simply a guild or association of persons
who were interested in teaching. They were divided
according to three levels of attainment, which were:
apprentice, journeyman, and master. The completion of the
lower level entitled the candidate to a baccalaureate
degree. Upon completion of the journeyman level, the
candidate was entitled to the master's or doctor's degree
if he succeeded in disputing and defending a thesis. "There
are records of such examinations at the University of
Bologna as early as A.D. 1419, and at the University of
Paris by the end of the thirteenth century." (8) The
broadening of the examination concept due to the influence
of medieval universities is of great import because this
institution was the crucible that produced the written
form of examination. According to Lang: "Probably the
first written examination at a university was in 1702, when
it was introduced at Cambridge, England. ... There
seems to be no doubt but that the universities of the
middle ages gave the examination system to our western
..." (9)


(7) Russell, Charles, Standard Tests, pp. 17, Ginn & Co., 1930.

The Boston examination of 1845 is an important
landmark in the development of tests and measurements in the
United States. Students of the history of education
attach great weight to this early examination for a number
of reasons; primarily, because it was the first
comprehensive written examination to be administered in any
school system of this country. The details of how the
examination came into being are interesting. The school
committee was empowered to make an inspection of the
schools each year. Part of this inspection consisted of an
oral examination of the pupils. Year by year as the
enrolment of the schools grew it became increasingly
difficult for the school committee members to get around to
all the schools and adequately test the pupils on the
subject matter they had studied. Realizing that they faced
a formidable, if not impossible, job, the school committee
decided to test merely the highest class in each school.
After a while, even this task began to be performed in a
perfunctory way.


(8) Lang, Albert R., Modern Methods in Written
Examinations, pp. 3, Houghton-Mifflin Co., 1930.


(9) Ibid, pp. 3.
In 1845, the school committee decided to delegate to
a sub-committee the duty of visiting schools and testing
the pupils on their intellectual attainments. This
sub-committee attacked its task with great thoroughness and
seriousness of purpose. It decided to give a written
examination, so a series of questions was drawn up in the
various subjects including astronomy, definitions,
geography, grammar, history and natural philosophy. These
questions were drafted very carefully and a conscious
effort was made to arrange the questions in the various
subjects in increasing order of difficulty. The committee
attempted this procedure in order to include a few
questions that probably even the poorest pupils could
answer, and a few that would probably be beyond the mental
powers of the best pupils. These questions were then
printed, to be handed out to the pupils on the day of the
examination.

The insight that these committee members had into the
subject of testing is amazing. The committee planned the
details for administering the examination with a great
deal of care. They were careful that no copies of the
examination got out. By eight o'clock in the morning they
appeared unannounced at different schools, each one of the
three examiners taking a different school. Boston had
nineteen grammar schools at that time, and these tests
were to be given only to the highest classes, which included
five hundred and thirty pupils. The committee member
examining a class would first see that all books and
reference materials were put away; next the pupils were
seated far enough apart to prevent communication; then they
were warned that only one hour was allowed for the
examination and they were not to spend time on handsome
writing; after this, the printed question sheets were
handed out and the pupils started. Promptly at the end of
the hour the examination papers were collected and the
examiner hurried on to the next school on his schedule. In
that way four schools were finished by each examiner in the
morning, and three more in the afternoon. The next day, the
committee members took another subject and went through the
same procedure. They continued this until they had given
examinations on all of the subjects previously listed.

The answers to the questions were scored carefully,
and the results tabulated and analyzed. In order to
score the papers as uniformly as possible, a set of rules
was prepared covering doubtful points. These rules
were used as guides rather than being rigidly adhered to.
The committee was very conservative in its comments upon
the examination; nevertheless, the results were quite
startling. "The inefficiency revealed by the survey was
as great a surprise and disappointment to the school
committee as many of our modern survey reports have
proved to be to those who made them." (10) The completeness
and thoroughness of this Boston examination place it
among the most remarkable incidents in the history of
education in the United States. (11)

In 1837 Horace Mann had been appointed Secretary of 
the State Board of Education in Massachusetts. His 
appointment was made in the face of determined opposition 
from the group of thirty schoolmen who served as masters 
of the Boston Public Schools. These schoolmen were 
afraid that the new secretary would infringe upon their 


prerogatives. Mann would be classified today as a
progressive educator. Naturally his ideas would clash with
those of this conservative band of schoolmasters. "If
Horace Mann did not precipitate the controversy with the
Boston schoolmasters, at least he welcomed their opposition
as an opportunity to direct the attention of the wealthiest
and most ambitious city in the state to conditions much
in need of ..." (12) The Boston examinations of 1845 made
a profound impression on Mann because he recognized the
significance of the scientific method applied to education
and hailed the report as the dawn of a new era.


(10) Caldwell, Otis W., and Courtis, Stuart A., Then and
Now in Education 1845 : 1923, pp. 7, World Book Co., 1924.


(11) Lang, Albert R., Modern Methods in Written
Examinations, pp. 7, Houghton-Mifflin Co., 1930.

Among his other duties, Horace Mann edited the Common 
School Journal. In this periodical, he published copious 
extracts from the Report, and discussed it and the subject 
of examinations at length. In his comments, Mann points 
out why he thinks the written examination is far superior 
to its oral predecessor. His comments reflect the 
brilliance of his great educational statesmanship that 
later was to be such a cogent factor in the development 
and trend of the American philosophy of education. 

Mann's arguments, briefly summarized, were as follows:

"1. It is impartial
2. It is fairer to the scholars
3. It is more thorough
4. It prevents officious interference of the teacher
5. It determines teaching efficiency
6. It prevents favoritism
7. It makes the results available to all
8. It reveals the ease or difficulty of
the questions." (13)


(12) Caldwell, Otis W., and Courtis, Stuart A., Then and
Now in Education 1845 : 1923, pp. 7, World Book Co., 1924.


Mann concludes that the superiority of the written exam- 
ination over the oral method was so clearly demonstrated 
that no school committee would ever again venture to 
return to the latter practice. 

Some years ago it was decided to repeat the Boston 
Tests in order to secure data that would allow for 
comparison between the relative advancement of pupils then 
and now. The 1845 test material was analyzed carefully 
from the point of view of present-day conditions. It 
was found that many questions could not be given to the 
pupils of today because of the shifting emphasis upon the 
aims and objectives of education. In general, most of
the deleted questions involved merely pure factual
knowledge. Thirty questions were selected, five
in each of six subjects, which seemed to have possibilities 
for twentieth-century children. The modified test was 
given to a large group of grammar school children in
various school systems throughout this country.


(13) Caldwell, Otis W., and Courtis, Stuart A., Then and
Now in Education 1845:1923, pp. 7, World Book Co., 1924.


"The outstanding conclusions from the Boston Tests
of 1845 are these:

1. Present-day children tend to make lower scores
on the pure memory and abstract skill questions
and higher scores on the thought or meaningful
questions;

2. the changes which have taken place are general
throughout the country; and

3. the efficiency of present instruction, even at
its best, although higher than in 1845, is still
far from satisfactory." (14)

The Development of Intelligence Tests 

It is safe to say that intelligence tests as we know 
them today have developed within the last twenty-five 
years. Intelligence tests, one of the most valuable 
tools of the progressive educator today, are a gift of the 
psychologists. "They emerged from experimental studies
of individual differences. In England, Galton was studying
individual differences by [illegible]." (15) According to Hildreth,
the concept of mental tests was introduced by Galton:
"Modern education and the science of child study
are greatly indebted to Galton. He undertook studies of
individual differences in imagery and sensory capacity,




(14) Caldwell, Otis W., and Courtis, Stuart A., Then and 
Now in Education 1845:1923, pp. 85, World Book Co., 1924. 


(15) Lang, Albert R., Modern Methods in Written Examina- 
tions, pp. 9, Houghton-Mifflin Co., 1930. 




collected materials on the problem of mental heredity, and 
laid the foundation of biometrics. He introduced the con- 
cept of mental tests and developed laws describing the 
distribution of mental [illegible]." (16) These studies were continued
in America by Cattell and Thorndike about a half-century 
later. 

It is with the name of Alfred Binet that the beginning 
of the intelligence testing movement is usually linked. 
Symonds says, "It is perhaps unfair to ascribe the beginning 
of the movement to Binet, but what was done before his 
time seems insignificant beside his contribution. Binet, a 
Frenchman (1857-1911), the son of a physician, was a genius 
whose interests were both theoretical and [illegible]." (17) At
about the same time (1894-95) that Dr. Rice was experi- 
menting in this country with his two spelling tests, Binet 
was working in France on his mental tests. The Binet 
scale was finally produced in 1905. It is an individual- 
type intelligence test; its administration is so complex 
that it should be given by a trained psychologist. 

The Binet test was produced originally for use in 
France. Later it was revised so that it could be used in 
this country. Symonds briefly describes the tests thus:
"These tests were a set of tasks to be performed under
controlled conditions and the responses were more or less


(16) Hildreth, Gertrude H., Psychological Service for 
School Problems, pp. 8, World Book Co., 1930. 


(17) Symonds, Percival M., Measurement in Secondary 
Education, pp. 53, the MacMillan Co., 1930. 




defined. In 1908 the first scale appeared with tests
grouped in age levels and the mental diagnosis was given
in terms of mental age. A revision, essentially the series
of tests used today, appeared in 1911." (18) The scale that
appeared in 1908 is known as the Binet-Simon scale for
the measurement of intelligence of school children. It is
called this because it represented the combined efforts of
Binet and his co-worker, a Frenchman named Simon.

The first English translation of the Binet-Simon scale 
was made by Dr. H. H. Goddard of the Vineland, New Jersey, 
Training School for feeble-minded. Goddard was stimulated 
to experiment with the new tests because of the urgency of 
dealing with the feeble-minded. Symonds contributes the 
following: "Intelligence testing is the resultant of at
least five converging movements, some practical, some
theoretical. Perhaps foremost of the movements was the
very practical one of dealing with the feeble-minded." (19)
It is interesting to note that the first English transla-
tion was used principally in institutions for the feeble-minded,
prisons, reform schools, and juvenile courts. The tests
were not satisfactory for use, however, with American
school children. "Goddard was the first psychologist to
make widespread use of the tests with school children in
the United States." (20)


(19) Ibid, pp. 53. 


(20) Hildreth, Gertrude H., Psychological Service for
School Problems, pp. 32, World Book Co., 1930.




"In 1913, Dr. Lewis M. Terman, of Stanford University, 
began a revision and extension of the Binet scale. The 
Stanford Revision was published by Terman in 1916, and has 
been widely used." (21) In fact, Terman's chief influence on
general education has been through his construction and 
application of revisions of the Binet scale and the 
interpretation of the results in the interests of general 
welfare. 

So far we have been dealing with individual intelli-
gence tests. An individual intelligence test is one that
can be administered to only one person at a time; it is
contrasted with group intelligence tests, which are capable
of testing simultaneously a large number of people. The
name of Dr. A. S. Otis is particularly noteworthy in this
connection. Symonds says, "To Otis properly belongs the
credit for compiling and publishing in 1918 the first
group test of intelligence as a measuring instrument." (22)

Ruch and Stoddard give the following summary: "Just
previous to the entry of the United States into the
World War, Arthur S. Otis had been working at Stanford
University under the direction of Terman on a test of
intelligence which could be administered to large groups
of people at the same time. With the entry of the
United States into the war, Otis's materials were placed


(21) Lang, Albert R., Modern Methods in Written Examina- 
tions, pp. 10, Houghton-Mifflin Co., 1930. 


(22) Symonds, Percival M., Measurement in Secondary
Education, pp. 56, the MacMillan Co., 1930.




at the disposal of the committee appointed to formulate
mental tests suitable for the examination of soldiers.
The Army Alpha tests were in considerable measure the
result of adaptation of the Otis materials." (23) Thus the group
intelligence test was born.

Rise of Standardized Tests

The first attempt at standardized objective tests is 
associated with the name of The Reverend George Fisher, an 
English schoolmaster, in 1864. He prepared a Scale Book 
which attempted to differentiate between different levels 
of work in composition, drawing, French, grammar, history, 
knowledge, mathematics, navigation, practical science, 
Scripture, spelling, and writing. These scales enabled 
the examiner to assign numerical values to the various 
subjects, the highest being 1 and the lowest 5, with
intermediate fourths between each value. We can see that
this attempt was years ahead of its time. Lang says,
"This attempt was too far ahead of the times to have any
immediate influence." (24) Like the Boston Examination, it was
so far ahead of its time that educators did not grasp its
vital significance.

The first attempt at standardized tests in this
country came exactly thirty years after The Reverend
Fisher's monumental work. American educators are indebted




(23) Ruch, G. M., and Stoddard, George D., Tests and
Measurements in High School Instruction, pp. 3,
World Book Co., 1927.


(24) Lang, Albert R., Modern Methods in Written Examina-
tions, pp. 12, Houghton-Mifflin Co., 1930.




this time to a great educational pioneer, Dr. J. M. Rice,
who devised a standardized test in spelling in 1894. Ruch
and Stoddard say, "Rice is probably entitled to the credit
of having produced the first educational test, for as
early as 1894-95 he constructed two spelling 'tests',
one in list form and one in sentence form. Later he made
similar tests in arithmetic and language." (25) Rice's spell-
ing test consisted of a list of fifty words, and he went
around to the schools of the various towns administering
it. As a result of his testing, he concluded that pupils
who studied spelling fifteen minutes a day for eight years
did as well as pupils who spent thirty minutes daily for a
like number of years. The reader will note that Dr. Rice's
work was going on at the same time that Alfred Binet was at
work on his mental tests.

It is fortunate that about the time that Dr. Rice was 
carrying on his testing program, E. L. Thorndike was a 
student at Columbia University. When Thorndike heard about 
Rice's work he was intensely interested, notwithstanding 
the fact that other educators had repudiated both Rice's 
results and his testing methods. About this time, too, 
the educational world was awakening to the need of modern 
testing procedures. Russell has the following comment: 


(25) Ruch, G. M., and Stoddard, George D., Tests and
Measurements in High School Instruction, pp. 2,
World Book Co., 1927.


"Dr. Rice supplied a new technique of measurement, but
improved very little on the measures and measuring in-
struments then in use. It remained for Dr. E. L.
Thorndike, then at Teachers College, Columbia University,
to establish some reliable units of measurement, which
he did in 1904, with the publication of his Introduction
to the Study of Mental and Social Measurements." (26) Like
the work of Mann and Fisher, Rice's work was too far ahead of the
times. Work with the standardized test was really brought
to fruition upon Thorndike's entry into the field.

During all this time the American public was gradually
becoming school conscious. "In our own country
education for the masses had seen a development never
before approached. By 1900 it was expected that every
individual should have a common-school education, and by
1925 a high-school education was coming to be looked upon
as the right of every boy and girl." (27) It is natural that
this trend would influence the development of tests and
measurements by accelerating the development of better
testing instruments. The increased demands on the schools
made it imperative that testing methods be completely
revamped. Brueckner and Melby call attention to a breakdown
in the traditional school machinery when it undertook to


(26) Russell, Charles, Standard Tests, pp. 34, Ginn & Co., 1930. 


(27) Brueckner, Leo J., and Melby, Ernest O., Diagnostic
and Remedial Teaching, pp. 18, Houghton-Mifflin Co., 1931.




educate all the children. "Evidences of this breakdown
are found in the studies of Ayres, Maxwell, and [illegible]." (28)
At any rate, the need of diagnostic testing and remedial
teaching was brought into bold relief. This need has per- 
sisted and still persists, and at present, conditions are 
aggravated by the prevailing economic depression that is 
forcing back into the schools low-ability pupils who 
ordinarily would be out working. 

Between the years 1903-15, Dr. Thorndike and his 
students constructed a number of tests and scales which 
were validated and standardized. Some of the notable ones 
were a handwriting scale (Thorndike-1909), several arith- 
metic tests (Stone-1908 and Courtis-1909), a scale in 
English composition (Hillegas-1912), a spelling scale 
(Buckingham-1913), and later two reading tests (The 
Thorndike Visual Vocabulary Scales 1914-1916, and The 
Thorndike Scale Alpha Two for Measuring the Understanding 
of Sentences 1915-1916). "With the beginning of the
Stone Arithmetic Tests in 1908, and of the Thorndike
Handwriting Scale in 1909, began what may be termed scientific
measurement in the field of education." (29) The establishment
of standards of accomplishment in the field of testing is
attributed to Dr. S. A. Courtis of Detroit. Modern teachers




(28) Ibid, pp. 18. 


(29) Lang, Albert R., Modern Methods in Written Examina- 
tions, pp. 13, Houghton-Mifflin Co., 1930. 




are particularly interested in the norms and standards 
published with the good standard examination because they 
are enabled to compare the status of their group with the 
standard. Russell says, "The establishment of standards 
of accomplishment in the same field, indicating degrees 
of educational advancement in that field on the part of 
school children, using the same measure for all the 
pupils, and thereby making a standard of accomplishment 
for the various grades, was a big [illegible]." (30)

Odell lists three factors that are of prime import-
ance in the development of the modern test movement, viz.:

"1. The considerably increased interest in school
marks during 1910 and the few years immediately
following.

2. Another directly influential movement was the
development of school surveys.... The first
survey to employ such tests was that of New York
City in 1911-12.

3. Several important periodicals began to devote
considerable attention to test development.
The Teachers College Record, the Journal of
Educational Psychology, Educational Administra-
tion and Supervision, and School and Society are
noteworthy in this respect...." (31)


(30) Russell, Charles, Standard Tests, pp. 38, Ginn & Co., 1930. 


(31) Odell, C. W., Educational Measurement in High School,
pp. 35 and 36, The Century Co., 1930.




As the reader can see, all of these influences had their effect
upon the new testing movement.

The school survey movement came into being in 1910, and
has contributed much toward the supervision and improvement
of instruction. The first school surveys gained great
publicity because of the multitude of crudities and defects
in the educational structure that they exposed. Existing
instructional methods were challenged; the complacency of
the "stand-pat" teachers was jarred. Now it is evident
that the school survey has come to stay. Progressive
teachers do not fear a school survey but rather welcome
it, because it enables their classes to show off to
advantage. "As the school survey movement developed it
soon changed in character from an occasional survey made
by outside experts, to a continuous survey of production
made from within by the superintendent of schools and
his staff." (32) The latest ramification of the school survey
idea is the creation of city bureaus of educational re-
search to conduct testing programs, gather data, and
interpret results.

School superintendents have been greatly aided in 
their work by the data from these surveys. Standardized 
tests have changed school administration from guesswork


(32) Cubberley, Ellwood P., Public Education in the United
States, Houghton-Mifflin Co., 1934.




to scientific accuracy. School administrators and teachers
need more knowledge about the defects and abnormalities of
their school venture; greater knowledge leads to deeper
insight into school problems. Only in this way can the
curriculum and courses of study be defended when the
searchlight of investigation is turned upon them. The goal
for all should be a more complete testing program.




C. THE SOCIAL-BUSINESS STUDIES OF THE SECONDARY SCHOOL 

It has long been felt by commercial educators that
the commercial curriculum should provide something more
than purely vocational skill training. During the last
thirty years it has been increasingly manifest that new
subject material should be infused into the commercial
course in order to enrich and supplement this training in
skills. The social-business studies gradually were
developed in order to supply the background of economic
and legal principles and knowledge so essential in the
building up of a social outlook and a social philosophy.
Dr. Tonne says: "The social-business subjects must be
justified in the course not because of their doubtful
alliance with the vocational business subjects, but rather
because of the contribution they are in a position to
make to a more efficient economic education for the
secondary-school [illegible]." (1)

Now the question: "What subjects are included in the 
social-business group?" There is no exact agreement among 
authorities as to what is included. In dealing with this 
group of subjects it must be remembered that there is a 
paucity of reference material. As Dr. Tonne expresses it:
"It can readily be seen that the available printed
objective material in the social-business subjects is very meager


(1) Tonne, Herbert A., and Tonne, M. Henriette, Social 
Business Education in the Secondary Schools, pp. 28, 
New York University Book Store, 1932. 




indeed. In all these subjects there is much opportunity
for good [illegible]." (2) Good books have come out
recently, however, that deal with these subjects. In
the most recent one, "Commercial Education in the High
School", Professor Frederick G. Nichols includes the
following subjects in this category: industrial and
commercial geography, commercial law, economics, business
organization, advertising and salesmanship, history of
commerce, and junior business training. Dr. Tonne, a
leading authority previously quoted, would include all of
the above subjects plus business English, and, possibly,
short courses in marketing and banking. The only real
difference so far is in regard to business English.
Whether or not to include this course in the social-business
studies will depend on the aims and objectives and
teaching methods and materials employed by the teacher
giving it. Then, there is a question, too, about the
inclusion of the first year of bookkeeping. When this
subject was taught some years ago preparation for vocational
efficiency was stressed; the skill training element
was predominant. Today, it is different. The emphasis is
upon the teaching of the principles of business through
bookkeeping with the skill training as a secondary
objective. [illegible]


(2) Ibid, pp. 224. 




that both business English and elementary bookkeeping 
should be included in the fold of the social-business 
subjects. 

It is necessary at this stage to consider some of the 
history of early curriculum proposals in order to arrive 
at the present status of the social-business group. One 
of the earliest proposals of importance was by "The
Committee of Nine" in 1903. (3) This committee proposed that
commercial geography be scheduled in the tenth year, 
commercial law and political economy each a half-year in 
the eleventh year, and history of commerce a half year in 
the twelfth year. As regards the direct effect of this 
report Professor Nichols says, “Nineteen years after this 
report appeared (1922) but little progress had been made in 
the direction of getting this recommendation for social- 
business subjects adopted as will be seen from the following 
table: (4)

Table 1
SUBJECTS                         STUDENTS

Bookkeeping..................... [illegible]
Shorthand....................... [illegible]
Typewriting..................... [illegible]
Commercial Law..................      19,611
Commercial Geography............      36,616
Commercial History..............       8,307


(3) Commercial Education in High Schools (1903), Univer-
sity of the State of New York, College Department, 
Bulletin 235, pp. 5-7. Cited by Nichols, pp. 426. 


(4) United States Bureau of Education Bulletin No. 35,
Statistics of Public High Schools (1929), pp. 102.
Cited by Nichols, pp. 426.


It will be noted that political economy was not mentioned,
although there were a hundred thousand students in
economics which was listed as an academic subject. No
evidence appears as to the progress of banking, finance,
and advertising, recommended by the Committee as electives.

A study by C. H. Marvin in 1922 indicates clearly
that the social-business subjects were being badly neglected.
He found the situation to be as follows: (5)


Table 2

                                  NUMBER OF SCHOOLS
SUBJECT                           REPORTING COURSE

Bookkeeping......................       109
Commercial Arithmetic............        84
Business English.................        94
[illegible]......................        89
Typewriting......................        88
Commercial Geography.............        65
Commercial History...............        14
Commercial Law...................        18
[illegible]......................        13
Salesmanship.....................        12
Advertising......................        12


A study made by Leverett S. Lyon in 1919 shows
somewhat better results for these subjects. (6) But this
study included only cities having a population of
100,000 or more. The results of this study may be


(5) C. H. Marvin, Commercial Education in Secondary 
Schools, pp. 40, Henry Holt and Co., 1922. Cited 
by Nichols, pp. 427. 


(6) L. S. Lyon, A Survey of Commercial Education in the 
Public High Schools of the United States, the 
University of Chicago Press, 1919. 



tabulated as follows: (7)


Table 3

                     NO. OF     NO. OF     NO.       NO.           NO.
SUBJECT              SCHOOLS    SCHOOLS    REQUIR-   INCLUDING     TEACHING IN
                     INCLUDED   REPORTING  ING       AS ELECTIVES  COM. DEPT.

Ind. History.....      224        136        31          12            12
Hist. of Commerce      224        136        25           6            15
Economics........      224        136        49      [illegible]       24
Com. Geography...      224        136        94          25            76
Com. Law.........      224        136        90          28            84
Bus. English.....      224        136        64           9            32
Salesmanship.....      224        136        15          25            31
Advertising......      224        136         8          16            18
Com. Organization      224        136   [illegible]       7             8


A study of this table indicates no widespread teaching of 
any of these subjects, except commercial geography and 
commercial law, as social-business subjects in a program 
of commercial education. Mr. Lyon concludes that:
"Social-business subjects, directed and taught as they are, 
sometimes by persons of purely classical training, cannot 
be relied upon to present any definite body of knowledge or 
consistent point of view. The evidence would seem to show 
that no definite point of view has been determined and that 
the results which are obtained from these courses must be 
varied in the extreme." (8)

These various studies have caused more attention to 
be focused on the social-business subjects and the trend 
in secondary schools seems to be to give them more prominence
in the curriculum. In general there seems to have


(7) Table formed by combining several by Lyon in Education 
for Business, pp. 369. Cited by Nichols, pp. 427. 


(8) L. S. Lyon, Education for Business, pp. 382, the
University of Chicago Press, 1931. Quoted in Nichols,
pp. 428.




been a steady growth in the pupil enrolment of all of 
these subjects with the exception of history of commerce. 
"The teaching of history of commerce has proved most 
unsatisfactory. Available instruction material was, and 
still is, scanty and faulty. Separating commercial and 
industrial incidents of history from their natural setting
tends to produce a distorted and one-sided view of
[illegible]." (9) It is doubtful if history of commerce
will function as an independent subject in the social-
business group until a reorganization and a re-evaluation
of subject-matter takes place.

“Since 1919 considerable progress has been made
toward appropriate emphasis on the social-business subjects
as the following statistics show: (10)


Table 4
                              Pupils--1915          Pupils--1928
SUBJECT                     NUMBER   PER CENT     NUMBER   PER CENT
Commercial Law...........   19,611     0.91       76,434     2.64
Commercial Geography.....   36,616     1.70      140,246     4.84
History of Commerce......    8,307     0.39        5,321     0.18
Economics................  103,540     4.80      147,035     5.08


An analysis of this table reveals that substantial gains 
have been made in enrolments in all courses except history 
of commerce. The report from which the above data were
taken does not list salesmanship or business organization,


(9) Frederick G. Nichols, Commercial Education in the 
High School, pp. 428, D. Appleton-Century Co., 1933. 


(10) United States Bureau of Education Bulletin No. 35,
Statistics of Public High Schools, 1929, pp. 102.
Quoted by Nichols, pp. 4351.


but these subjects are now taught in many schools. New
developments are going on in the offering of the various
schools throughout the United States and, at present, it
seems that the tendency is to regard the social-business
subjects as the core of the commercial curriculum.

There is one further aspect of the topic to be
considered now that the increased emphasis upon the social-
business subjects has been brought out. This has to do
with the relative proportions of boys and girls enrolled in
the social-business courses. It has been established after
scientific research that girls tend to enrol in the social-
business studies more than boys. This holds true for all the
subjects except economics, where the proportions are about
equal. “This may be accounted for by the fact that
economics is probably considered an academic subject rather
than a business subject in most high schools.” (11)

Professor Tonne presents the results of his study in
the following table: (12)


(See next page for table.) 


(11) Tonne, Herbert A., and Tonne, M. Henriette, Social 
Business Education in the Secondary Schools, pp. 77, 
New York University Book Store, 1932. 


(12) Ibid, pp. 79.




Table 5 - PROPORTION OF BOYS AND GIRLS ENROLLED IN THE
SOCIAL-BUSINESS SUBJECTS

                            Number and Percentage
Subject               Boys    Per Cent    Girls    Per Cent    Total

Economics.........    9,255     48.5      9,852      51.5     19,107
Business Law......   12,031     42.5     16,305      57.5     28,336
Economic Geo. ....   15,418     38.2     24,934      61.8     40,352
Business English..   12,676     36.6     21,990      63.4     34,666
Bus. Organization.    1,852     52.8      1,658      47.2      3,510
Jr. Bus. Training.    1,337     41.1      1,920      58.9      3,257
Salesmanship......    7,746     48.7      8,147      51.3     15,893
Advertising.......    3,255     49.3      3,347      50.7      6,602
His. of Commerce..    1,613     43.7      2,080      56.3      3,693
Banking...........      149     45.2        181      54.8        330

Read thus: In the subject Economics data were
secured for 9,255 boys in a study of 410 high schools.
This is 48.5 per cent of the total number of students
studied. Data for 9,852 girls were secured. This was
51.5 per cent of the total number of 19,107 students.
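
The reading rule above can be checked by simple arithmetic. The following is a minimal sketch in Python, not part of Professor Tonne's study: the boys' and girls' counts are those printed in Table 5, while the names TABLE_5, per_cent, and summarize are assumptions introduced here for illustration only. Each total is the sum of the two counts, and each percentage is a count divided by that total. The percentages so computed agree with the printed ones to within a tenth of a per cent.

```python
# Boys and girls enrolled per subject, as printed in Table 5.
# The dictionary and function names are illustrative assumptions.
TABLE_5 = {
    "Economics": (9255, 9852),
    "Business Law": (12031, 16305),
    "Economic Geography": (15418, 24934),
    "Business English": (12676, 21990),
    "Business Organization": (1852, 1658),
    "Junior Business Training": (1337, 1920),
    "Salesmanship": (7746, 8147),
    "Advertising": (3255, 3347),
    "History of Commerce": (1613, 2080),
    "Banking": (149, 181),
}


def per_cent(part, total):
    """Return part as a percentage of total (unrounded)."""
    return 100.0 * part / total


def summarize(boys, girls):
    """Return (total, per cent boys, per cent girls) for one subject."""
    total = boys + girls
    return total, per_cent(boys, total), per_cent(girls, total)


# Print one line per subject in the column order of Table 5.
for subject, (boys, girls) in TABLE_5.items():
    total, pct_boys, pct_girls = summarize(boys, girls)
    print(
        f"{subject:<26}{boys:>7,}{pct_boys:>7.1f}"
        f"{girls:>8,}{pct_girls:>7.1f}{total:>8,}"
    )
```

A check of this kind is useful chiefly for catching copying errors: a total that is not the sum of its two counts, or a pair of percentages that does not add to one hundred, points to a misprinted figure.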

Professor Tonne accounts for the above situation in
this way: “The fact that the proportion of boys to girls
is not actually greater in all social-business subjects
may be attributed among other reasons to: 1, the close
traditional association of the social-business subjects
with bookkeeping, stenography, and typewriting which
appeal primarily to girls; 2, the subjects are not taught
in such a manner that they will appeal to the interests of
boys; and 3, improper guidance from the [illegible] other
students, or from the high-school faculties." (13)


(13) Ibid, pp. 79. 




It has been shown already that at the present time 
the social-business subjects are coming into prominence 
more and more. That this tendency is desirable is now
generally conceded among commercial educators. The old
basis of business education, and the present business-
college system, is fundamentally the developing of skills.
Professor Tonne cites H. G. Shields in this connection:
"The inadequacy of our present secondary-school business
curriculum in providing the vocational-school student with
general business training is at present one of the most
important issues among certain thinkers in business
education. H. G. Shields, School of Commerce, University
of Chicago, deplores the fact that business education as
it is today is really clerical education. Shields is of
the opinion that the student in the present curricula in
business education is being trained merely in technical
skills and is in no way being oriented in the realities of
business." (14) This situation can be obviated by the
insertion of balanced social-business materials properly
co-ordinated and integrated so that the lost values in
the business curriculum may be recaptured.

It is generally admitted that post-depression con- 
ditions are making it necessary for all business men to 
show their mettle. Conditions are tense; bankruptcy or 

(14) H. G. Shields, "Our Clerical Mills", the Journal of
Business Education, Vol. 4, May 1930, pp. 34. Cited
by Tonne, pp. 52.




loss of position faces the unlucky individual that does not
measure up to the set standards. A few decades ago high
school graduates were content to enter a small business and
"learn the business". Today the lure of romance and adven-
ture centers upon the colossus of business - the vast
corporation with its many subsidiaries. Secondary school
graduates, many times, feel this gravitating force and as
a result obtain work in some nationally known company. Now,
the point is this: unless these young graduates study the
principles and business knowledge underlying the work of
these giant combines, they will be entirely unoriented and
failure, with its accompanying sense of inferiority, will
follow. This situation will be circumvented only by a care-
ful inclusion of social-business material in the commercial
pupil's course during high school days.

It is a mooted question whether a commercial educator
has the right, in his vocational guidance work, to advise
pupils to enter the field of business if he feels that the
pupil will not advance above the clerical level. This
statement is particularly pertinent in reference to the boy
pupils. It must be borne in mind that the commercial course
provides the entree into business, nothing more. Once the
young graduate gets his foothold, then he must climb according
to his own ability. The problem of getting stranded on




a lower level of the ladder is real and ever-present. Professor
Nichols says, “As a partial offset against the tendency to
become stranded on the clerical level, every high school
commercial pupil should learn something of the fundamental
principles of business management, become somewhat familiar
with the functions of major departments of a business
organization, acquire some small degree of understanding of
legal principles applicable to business transactions,
secure some rudimentary comprehension of what are regarded
as important economic principles, and give a little atten-
tion to the study of basic industries as possible fields
with which to become identified later. The social-business
subjects seem to be the best media through which to achieve
these [illegible].” (15)

The query is often raised: “Why do not the social- 
business subjects yield greater educational values in the 
way of an improved social outlook and social philosophy?”
The nub of the problem concerns the organization of the 
social-business courses. An impartial examiner examining 
representative courses of study in these subjects would be 
forced to conclude that the subject-matter is not well 
organized. Professor Tonne explains this faulty organiza- 
tion thus: "The usual obstacles, such as a general lack of 


acceptance of sufficiently uniform objectives, the diver- 


(15) Frederick G. Nichols, Commercial Education in the 
High School, pp. 436, D. Appleton-Century Co., 1933. 




gence of ideas as to the content of the social-science 
curriculum, the overlapping of subject matter, unsuitable 
textbooks, and poorly trained teachers, are factors which 
are all present and prominent." (16)

Before the social-business group yields its full values
two conditions will have to be alleviated. The first has
to do with the grade placement of materials. Professor
Tonne says, “The social-business subjects should be spread
out to a much greater extent among the four years of high
school. In this way they will not compete with each other
and with the other subjects of the curriculum which struggle
for the twelfth year, which for many reasons seems to be
considered the choice [illegible]." (17) The second condition harks
back to the methodology employed in presenting these sub-
jects. It has to do with the lack of correlation and
integration between the different subjects. Professor
Tonne points out this weakness thus: "This situation
shows that the social-business subjects function as inde-
pendent units, for the most part entirely apart from each
other and from other subjects in the curriculum." (18) In
conclusion, then, the future should bring a complete re-
evaluation and re-organization of materials in the social-
business studies if they are to yield the rich dividends so
necessary in the fuller education for pupils enrolled in the
commercial curriculum.


(16) Tonne, Herbert A., and Tonne, M. Henriette, Social 
Business Education in the Secondary Schools, pp. 63, 
New York University Book Store, 1932. 


(17) Ibid, pp. 81.
(18) Ibid, pp. 88.


D. THE CRITERIA OF A GOOD EXAMINATION
Personal judgment determines to a great extent
whether an examination is good or bad. Before any judg-
ment can be made the type of examination and the function
or functions it is to perform must be taken into
consideration. It is obvious that an examination can
be rated just the same as a story or a picture. In each
case no rating can be attempted until the judges decide
upon the properties selected as standards by which to
judge. It is essential that these standards be reason-
able criteria. "Hardness" might be all right as a quality
by which to judge rocks but it is too unmeaningful as a
standard for judging examinations. "The qualities which
test experts have agreed upon as composing the most service-
able and straightforward standards for evaluating examina-
tions are: (1)

1. Validity

2. Reliability

3. Objectivity

4. Comprehensiveness

5. Facility

6. Utility

7. Rapport"

It is Professor Lang's contention that these seven qualities
include all the desirable characteristics of good examina-
tions. The degree to which they are present determines
the worth of the examination.


(1) Lang, Albert R., Modern Methods in Written Examina- 
tions, pp. 47, Houghton-Mifflin Co., 1930. 



Having decided upon standards for evaluating an
examination, it will be necessary to consider each one
in detail. By far the most important one is validity.
Ruch says, "The most important single fact which can be
known about a test or examination is the degree of
validity which it possesses." (2) From the teacher's stand-
point a test is not valid unless it covers all the im-
portant items taught. The pupil does not consider a test
good that does not stress the important parts of the
unit or the course that he has considered. Both of these
requirements are elements in the validity of the test.
Sometimes the terms "goodness" or "worthwhileness" are
used as synonyms for validity. It is of prime importance
that the testmaker incorporate in the test or examination
those elements or items that are essential and take pains
to eliminate the nonessentials.

Professor Ruch gives a number of ideas that taken
collectively represent the concept of validity. These
are: (3)

"1. Validity is the degree to which a test or
examination measures what it is intended to
measure.

2. Validity is the general worthwhileness of an
examination.

3. Validity refers to the care taken to incorpor-
ate in a test or examination those elements or


(2) Ruch, G. M., The Objective or New-Type Examination, 
pp. 27, Scott, Foresman and Co., 1929. 


(3) Ibid, pp. 28.




items which are of prime importance, and to 
the pains taken to eliminate the non-essential. 


4. Validity is in general the degree to which a 
test parallels the curriculum and good teaching 
practice. 

5. Validity refers to the value of the test for 
measuring specific abilities in an accurate 
fashion, and a test ceases to have validity
when applied to the measurement of abilities 
for which it was not intended.” 

All too often the classroom teacher is prone to accept 
that a test measures reading ability if it is labeled a 
reading test. In certain cases it has been proved that 
tests were wrongly named, yet the classroom teacher does 
not delve into the information about the building of the 
test in order to find out what the test-maker wishes 
measured or the functions which he desires performed. 

In the listing of Professor Ruch's ideas of validity 
above, the third one is worthy of amplification. Care 
should be taken to include in the examination those
instructional materials that are really worth while. The
most significant items can be determined from the course
of study, the basic textbooks and what authorities con-
sider the minimum essentials. In addition to this,
irrelevant factors such as penmanship, spelling, English,
neatness, arrangement, speed of writing, and the like,
should be eliminated or minimized unless the examination
is intended to measure one of these. A valid commercial
geography test is one that measures achievement in commercial 
geography and nothing else. If factors other than commercial 
geography enter into the measurement, the test becomes 
invalid as a commercial geography test. 

Professors Ruch and Stoddard give a fine summary of
the most frequently used validation methods. These include
the following: (4)

"1. Textbook analysis

2. Analysis of courses of study

3. Analysis of final examination questions

4. Pooled judgments of competent persons

5. Use of rating scales in setting up criteria

6. Correlations with school marks or other measures
of school success

7. Increase in percentage of successes with
successive ages or grades

8. Correlations with previously validated measures

9. Differential scores shown by two groups known to
be widely separated upon a scale of ability

10. Determination of social utility

11. Logical or psychological analysis

12. Correlations with tests of other intellectual,
non-intellectual or educational abilities."


This summary includes the methods used for validating both
psychological and educational tests. As the scope of this

(4) Ruch, G. M., and Stoddard, George D., Tests and
Measurements in High School Instruction, pp. 304,
World Book Co., 1927.




thesis embraces merely the field of educational tests and
measurements some separation will have to be made. Methods
5, 8, 9, 11, and 12 are used for the validation of either
psychological tests or trade tests, and consequently will not
be treated at this time. The other methods are all of
use in reference to testing in the social-business subjects.
Needless to say, some of them are of much greater importance
than others. Inspection of the list shows that the first
four reduce to the single criterion of expert opinion.
Methods 6, 7, and 10 are experimental in character but
these methods tend toward greater refinement than is
possible or necessary in constructing informal objective
tests for classroom use. Sometimes a combination of
methods is used in order to validate the test.

At this point, each method will be taken up and
explained in detail. The first is the "textbook analysis"
method. Professor Carlson used this method very successfully
in his Series D, Bookkeeping Tests to accompany
"Twentieth Century Bookkeeping." (5) Consider that a bookkeeping
examination covering the first year's work, to be
given throughout the state, is being prepared. Suppose
that a number of standard bookkeeping textbooks are used
among the various high schools. It is quite possible, then,


that the material taught by one teacher using a certain 


(5) Carlson, Paul A., The Measurement of Business 
Education, pp. 9, Monograph No. 18, South-Western
Publishing Co., 1932. 




textbook might be somewhat different from that taught by
another teacher using a different textbook. Under the
circumstances, the work of the committee for the preparation
of the examination will be rendered difficult.
In the first step each textbook will have to be analyzed
carefully to secure the important points it contains in
statement form. Next, the results from the various textbooks
will have to be consolidated after an adjustment has
been made to guard against repetition and overlapping.
After this, these items will be turned into test questions
using whatever test technique (true-false, completion,
multiple-choice, matching, etc.) is best suited to the
individual item. Now, some provision must be made to make
sure that no individual phase of the subject matter is
overemphasized. In most tests this would be taken care of
by the "Table of Specifications," which will be described
later. In a bookkeeping test, a check of the test material
can be made against the accounting cycle, since each textbook
is supposed to cover the cycle. The test-makers should
also make sure in this checking process that all steps in
the cycle have been covered with completeness. If this
whole procedure is used, the bookkeeping contest examination
should parallel all one-year bookkeeping courses in
the state. Professor Carlson then goes on to describe how




he combines this method with the “pooled judgments” 
method before finally validating his tests. 

Before we attempt to evaluate this method, it must be
remembered that it is one in common use. The main
advantages that can be claimed for it are that it is simple
to use, and that tests thus constructed do tend to fit rather
closely the actual teaching practice of the day. It is
important to note that the textbook analysis method reduces
fundamentally to the method of pooled judgments of competent
persons, since each textbook's content represents the
judgment of one or more supposedly competent persons.

Professors Ruch and Stoddard call attention to the
following disadvantage: "A test which represents nothing
more than a composite picture of the content of ten,
fifteen, twenty or more representative textbooks in a
given school subject cannot rise above the level of
measuring what is actually taught in the rank and file of
schools. Such a test fails to a degree, because it does
not measure what ought to be taught." (6) If this is true, the
textbook analysis method would be fair for the majority of
teachers who use the standard pedagogical methods but
would be unfair for the small minority of "progressive"


teachers who use the textbook as a guide rather than a 


(6) Ruch, G. M., and Stoddard, George D., Tests and 
Measurements in High School Instruction, pp. 305,
World Book Co., 1927. 




bible. In conclusion, these same authors say, "This method
is not to be advocated where better methods can be [used]." (7)
The next method to be considered is "validation by
analysis of courses of study." As this is one of the minor
methods, it will be treated very briefly. It is a variant
of the textbook analysis method. It is not considered so
good as the latter, however, because courses of study all
too often include a multiplicity of detail about aims and
objectives and merely scant outlines of the teaching
material. Professors Ruch and Stoddard conclude that:
"On the whole, the analysis of courses of study is inferior
to the textbook analysis, due to the fact that courses of
study in their usable (published) form are far less detailed
than textbooks." (8)

The "analysis of examination questions" method, the
next method to be explained, is considered a variant of the
two preceding methods. It has about the same limitations.
This method has possibilities if it is used in conjunction
with other methods. In this method, the test-maker gets
in touch with leading teachers in a certain subject and
asks them to submit copies of the final examinations they
have used over a period of years. The test-maker then


analyzes these and eliminates all repeat items. The 




(7) Ibid, pp. 307. 
(8) Ibid, pp. 307. 




remaining items are then summarized under appropriate
headings covering the individual phases of the subject.
After this, the work of converting the individual items into
test questions is carried out. In concluding about this
method, Ruch and Stoddard say, "The teacher's selections of
questions for her final examination represent an additional,
thoughtful culling out of the non-essential. . . . This
selection tends toward added refinement over the textbook
or course-of-study analysis method. It shades over at the
same time to the method of judgments of competent persons." (9)
The method par excellence today is the "pooled judgments
of educational authorities," the next method to be
taken up. All the methods taken up so far have really
rested in their final analysis upon pooled judgments. We
have noted previously how Professor Carlson made use of
the textbook analysis method in validating his Series D,
Bookkeeping Tests. Later, after the bookkeeping examination
was extensively used, he obtained ratings on the examination
from hundreds of bookkeeping teachers and thousands of
pupils who were examinees. These ratings were evaluated in
order to ascertain whether a unanimity of opinion had been
expressed that the bookkeeping examination did cover


thoroughly the contents of a one-year course in bookkeeping. 


(9) Ibid, pp. 308. 


This unanimity of opinion was found present in the ratings 
given Professor Carlson's tests, consequently they are said 
to possess high validity. 

The "pooled judgments" method is of great importance
in connection with the sifting of the original lot of
tentative test items with a view to the elimination of the
least valid materials. "Experience has shown that the
average or median judgment of a group of from three to
ten careful judges is certain to be superior to the opinion
of a single worker in approximating the true worth and
difficulties of proposed test items." (10) The valid examination
should parallel the flow of actual teaching and should
represent an extensive sampling of the materials of
instruction. The pooled judgment of competent persons is
a good index as to whether or not this standard has been
attained. In conclusion, Ruch and Stoddard say, "The
method of pooled judgments, alone or in combination with
other methods, is by far the most common validation practice
in educational test construction today. Unfortunately,
the judgments very often represent the opinions of but
two or more persons, and are not checked up against
experimental [data]." (11)
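
The pooling of judgments described above may be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pooled_rating`, the rating scale of 1 to 5, the cutoff of 3, and the sample ratings are all hypothetical and are not taken from Ruch and Stoddard.

```python
from statistics import median

def pooled_rating(ratings):
    """Return the median of several judges' ratings of one test item.

    `ratings` holds one number per judge (hypothetical 1-5 scale).
    """
    if not 3 <= len(ratings) <= 10:
        # The quotation speaks of "from three to ten careful judges".
        raise ValueError("expected ratings from three to ten judges")
    return median(ratings)

# Hypothetical ratings of three tentative items by five judges.
item_ratings = {
    "item A": [4, 5, 4, 3, 4],
    "item B": [2, 1, 2, 3, 1],
    "item C": [5, 4, 5, 5, 2],
}

pooled = {name: pooled_rating(r) for name, r in item_ratings.items()}

# Sift out the least valid material: keep items whose pooled rating
# reaches an assumed cutoff of 3.
retained = [name for name, score in pooled.items() if score >= 3]
print(pooled)
print(retained)
```

The median is used rather than the mean so that a single dissenting judge (as with "item C") does not swing the pooled result.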


The use of the correlation method makes it possible 


(10) Ibid, pp. 310. 
(11) Ibid, pp. 312. 




to determine the degree of validity of a test in numerical
terms. This brings us to the next method, which is "correlations
with school marks." It is very common practice to
use school marks in the validation of test materials since
these marks are easily obtainable. This method is of no
little value to test-makers. The mathematics of statistical
correlation will be reserved for a later topic in this
thesis and will there be explained in detail. In general,
the usual procedure is to correlate test scores and marks
by the Pearson product-moment formula and then make
additional statistical corrections in order to produce a
greater degree of refinement. Ruch and Stoddard say,
"Such correlations are never high, 0.85 being about the
highest which the writers have ever seen reported in the
literature for single classes. The reason for such low
correlations with school marks is to be found in the low
reliability of the marks."
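
As a rough illustration of the procedure just mentioned, the Pearson product-moment correlation between test scores and school marks might be computed as follows. The function name `pearson_r` and the sample scores and marks are assumptions made for the sake of the example; the additional statistical corrections referred to in the text are not attempted here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 long")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: one test score and one school mark per pupil.
test_scores = [52, 61, 70, 48, 85, 77]
school_marks = [70, 75, 80, 72, 90, 78]
print(round(pearson_r(test_scores, school_marks), 2))
```

A coefficient near 1.00 would indicate that pupils stand in nearly the same order on the test as in their marks; a coefficient near zero would indicate little agreement.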

"Validation through percentage increases in successes" 
is the next method to be examined. This method is of great 
importance in reference to the drafting of standard tests 
for grammar school use. The first step consists of drawing 
up an experimental edition of the test and administering 


it to hundreds of pupils in different school grades. The 


(12) Ibid, pp. 318.




pupils are then grouped according to either age or grade
level, and the percentage of pupils passing each item is
then computed. If the percentage of successes rises
sharply and uniformly from one grade to the next, the
item is judged to be valid and reliable because it discriminates
between different levels of ability. If the
rise of the percentage of successes is erratic, however,
the item is invalid and should be eliminated. "In
summary, items with 'throwbacks' are a source of great
unreliability. Items passed by 0 per cent or 100 per cent
are functionless but do not cause unreliability. They are
to be looked upon as 'dead timber.' The sharper the
rise, the greater the reliability of the item." (13)
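
The check described above can be sketched as follows. The names `percent_passing` and `classify_item` and the sample percentages are hypothetical, and the decision to treat a level (neither rising nor falling) step between two grades as acceptable is an assumption of this sketch rather than a rule stated by Ruch and Stoddard.

```python
def percent_passing(results):
    """Per cent of pupils passing an item; `results` is a list of booleans."""
    return 100.0 * sum(results) / len(results)

def classify_item(percent_by_grade):
    """Classify an item from its per cent passing in successive grades.

    `percent_by_grade` lists the percentages in order, lowest grade first.
    """
    if all(p == 0 for p in percent_by_grade) or all(p == 100 for p in percent_by_grade):
        return "functionless"   # the "dead timber" of the quotation
    steps = [later - earlier
             for earlier, later in zip(percent_by_grade, percent_by_grade[1:])]
    if any(step < 0 for step in steps):
        return "throwback"      # a higher grade does worse than a lower one
    return "rising"             # no drop from any grade to the next

# Hypothetical per cents passing in four successive grades.
print(classify_item([20, 45, 70, 90]))
print(classify_item([30, 55, 40, 80]))
print(classify_item([100, 100, 100, 100]))
```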

Even a casual examination of this method will show
that it is unsuited to most high school subjects, particularly
those in the commercial curriculum. Ruch and Stoddard
point out that: "It is limited in its utility only in two
directions; viz., (a) in a few physiological capacities
which do not continue to develop over a period of years,
and (b) in those school subjects which are discontinuous
over a series of grades." (14) The social-business subjects
would be subject to the second limitation, as they are
mainly grouped in grades 11 and 12 of the high school.




(13) Ibid, pp. 322. 
(14) Ibid, pp. 319. 




The "validation by the principle of social utility"
method is limited in usefulness. In short, this method
presupposes a checking of the test in the light of the
social significance of the information called for in it.
In other words, the test should include to a great extent
only knowledge and information that will actually function
in the lives of the individual pupils. In the construction
of a curriculum, the determination of the social usefulness
of educational content is a weighty problem. In fact,
at the present time the content of many commercial courses
is being attacked on the ground of social utility. The
opponents of these courses claim that the instructional
content is ill-adjusted to meet the needs of present-day
youth. It is patent, then, that this problem of social
utility for tests must be considered.

It is not possible to validate all subjects by this
method. "In a few subjects, chiefly elementary, important
experimentation has been done in this direction. Extensive
counts have been made by Thorndike and Horn on vocabularies;
by Horn and Ashbaugh on spellings used in business;
and by Wilson and others on the arithmetic of [business]." (15)
This method has not been of great importance up to the
present in the construction of tests and examinations for
the social-business studies.


(15) Ibid, pp. 325.




In order to complete our list of validation methods
in the field of educational tests and examinations, it is
necessary to include some explanation of the method for the
validation of individual test items. Professors Ruch and
Stoddard did not include this method in their list, but it
has been used extensively in recent years, mainly in combination
with other methods. "The assumption is that if
each item of the test is valid, the entire test - the sum
of valid parts - possesses a high degree of [validity]." (16)
This technique was used as one of the steps in validating the
Peters, Greiner and Green Commercial Law Test. (17) In this
process, the test was administered in the Commercial Law
classes of a number of high schools and several hundred
papers were obtained. After being corrected carefully,
these papers were arranged in order from the highest to
the lowest scores. The best ten per cent of the papers were
separated from the pile and, then, the poorest ten per cent.
Every answer in the best ten per cent of the
papers was compared with the same answer in the poorest ten
per cent. If the pupils in the best papers
did not clearly demonstrate their superiority in knowledge
on a given item, then the item was weeded out.
It is obvious that if the "poorer" pupils answered an 
(16) Carlson, Paul A., The Measurement of Business 

Education, pp. 10, South-Western Publishing Co., 1932. 
(17) Published by South-Western Publishing Co. 




item correctly more times or the same number of times as
the "better" pupils, something was inherently wrong with
the item. "All of the items retained discriminated between
good and poor pupils and were therefore considered valid
items." (18)

Professor Ruch calls this method "an experimental 
method of validating individual test items". He labels it 
such in order to bring out that this method makes possible 
a greater degree of refinement than any of the methods
involving "pooled judgments". He next gives a series of
seven steps to be followed, viz.:

"1. Make up the test items, arranging them by inspection
    in order of difficulty.

 2. Give the test to the class, allowing time for
    all to attempt every item.

 3. Score the papers.

 4. Arrange the papers in order of size of scores.

 5. Find the median mark and separate the papers
    into two classes: (a) those above the median,
    and (b) those below the median. Call the first
    group the 'good' pupils and the second group
    the 'poor' pupils.

 6. Tabulate the number of pupils passing (or failing)
    each individual test item, keeping separate
    tabulations for the 'good' and 'poor' pupils.
    Express the passes (or failures) in per cents.

 7. Study the per cents for 'good' and 'poor'
    groups. Reject items where the 'poor' group
    shows percentages of successes as high as or
    higher than the 'good' group. Such items do
    not differentiate abilities. The best items will
    show the largest differences in successes in
    favor of the 'good' group." (19)
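
Steps 3 through 7 above might be sketched in Python as follows. The function name `item_analysis`, the 0/1 scoring of each item, and the six sample papers are hypothetical. Papers falling exactly at the median are left out of both groups in this sketch; that is an assumption, since the steps as quoted do not say how such papers are to be handled.

```python
from statistics import median

def item_analysis(papers):
    """Per cent passing each item in the 'good' and 'poor' groups.

    `papers` is a list of lists of 0/1 marks, one inner list per pupil
    and one mark per item. Returns one (good_pct, poor_pct, keep) tuple
    per item, where keep is True only if the 'good' group did better.
    """
    totals = [sum(paper) for paper in papers]            # step 3: score
    cut = median(totals)                                 # step 5: median mark
    good = [p for p, t in zip(papers, totals) if t > cut]
    poor = [p for p, t in zip(papers, totals) if t < cut]
    if not good or not poor:
        raise ValueError("scores do not split into two groups at the median")
    report = []
    for i in range(len(papers[0])):                      # step 6: tabulate
        good_pct = 100.0 * sum(p[i] for p in good) / len(good)
        poor_pct = 100.0 * sum(p[i] for p in poor) / len(poor)
        report.append((good_pct, poor_pct, good_pct > poor_pct))  # step 7
    return report

# Hypothetical papers: six pupils, four items (1 = correct, 0 = wrong).
papers = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
]
for number, row in enumerate(item_analysis(papers), start=1):
    print(number, row)
```

In this made-up class the third item is the only one rejected, because the "poor" group passes it more often than the "good" group.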




(18) Carlson, Paul A., The Measurement of Business
Education, pp. 10, South-Western Publishing Co., 1932.
(19) Ruch, G. M., The Objective or New-Type Examination,
pp. 57, Scott, Foresman and Co., 1929.


Ruch illustrates the method by means of an example. 
The illustration he uses is worthy of careful study. 


Table 6 - PER CENTS OF "GOOD" AND "POOR" PUPILS ANSWERING
          INDIVIDUAL ITEMS OF A TEST

               Per Cent of Correct Answers

Item    "Good" Group    "Poor" Group    Both Groups

  1.         14              14              14
  2.         21               7              14
  3.          0               6               3
  4.         84              16              50
  5.         53              49              51
  6.        100              98              99
  7.          0               0               0
  8.        100             100             100
  9.          0               8               4
 10.         50              50              50


An analysis of this table reveals the following
facts:

Item 1 does not properly differentiate between
high abilities and low abilities and consequently might
be replaced to advantage by an item showing greater
differentiation.

Item 2 is greatly superior to item 1. The retention of
items like number 2 will result in greater validity in
the test than will the retention of items like number 1.

Item 3 should be discarded.

Item 4 is a good one. There is sharp discrimination between
good and poor pupils.

Item 5 does not hurt the test but it is distinctly inferior
to item 4. If possible it should be replaced with an item
like number 4.


Item 6 is very easy for both groups of pupils. A few items 




like this should be retained and put at the beginning of 

the test in order to create a sense of "rapport" with the 
pupils. 

Item 7 should be thrown away except for a few to be inserted 
in the test to prevent perfect scores. 

Item 8 is similar to item 6. 

Item 9 should be eliminated. 

Item 10 probably should be eliminated because it does not
differentiate between abilities of the higher and lower 
groups. 
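
The reading of Table 6 given above can be checked mechanically by taking, for each item, the per cent of the "good" group minus the per cent of the "poor" group. The sketch below is only illustrative: the names `TABLE_6` and `margin` are hypothetical, and the figures are those of Table 6 with cells that are illegible in the surviving copy inferred on the assumption that the "Both Groups" column is the average of the other two.

```python
# (good %, poor %) for items 1-10, as read from Table 6.
# Cells illegible in the surviving copy are inferred (an assumption).
TABLE_6 = {
    1: (14, 14), 2: (21, 7), 3: (0, 6), 4: (84, 16), 5: (53, 49),
    6: (100, 98), 7: (0, 0), 8: (100, 100), 9: (0, 8), 10: (50, 50),
}

def margin(item):
    """Per cent of the 'good' group passing minus that of the 'poor' group."""
    good, poor = TABLE_6[item]
    return good - poor

# Items on which the 'poor' group does better than the 'good' group.
poor_ahead = [i for i in TABLE_6 if margin(i) < 0]
# Items showing no difference at all between the groups.
no_difference = [i for i in TABLE_6 if margin(i) == 0]
# Remaining items, sharpest discrimination first.
ranked = sorted((i for i in TABLE_6 if margin(i) > 0), key=margin, reverse=True)

print("poor group ahead:", poor_ahead)
print("no difference:", no_difference)
print("discriminating, best first:", ranked)
```

Items 3 and 9 come out with the "poor" group ahead, and item 4 heads the discriminating items, which agrees with the commentary above. The margin alone does not separate the very easy items (6 and 8) or the very hard item (7) from the rest; retaining a few of these for the reasons given is a matter of judgment.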

Certain of the validation methods that have been 
described in detail so far are applicable only to the 
construction of standard tests. Professor Ruch has a 
number of suggestions in regard to validating tests having 
in mind particularly teacher-built objective tests. He 
proposes the following:

"1. In the course of regular teaching, make a
    practice of jotting down good test items
    (questions) as they occur to you.

 2. Place these test items on small bits of paper;
    3x5 library cards are best. Make a file of
    these questions. Secure a filing case and keep
    these cards. . . .

 3. When the time comes to build an examination,
    draw up a Table of Specifications. This will
    tend to guarantee a defensible balance of
    emphasis, freedom from non-essentials, and the
    inclusion of all important topics.

 4. After the test is given, ask the pupils to
    suggest items that were ambiguous, misleading
    or not understood. These should be either
    revised or discarded.

 5. Where possible, try to have one or two other
    teachers criticize your test items and rate
    them for difficulty.

 6. The validity of a test is raised by having
    the items of a proper degree of difficulty.
    Items passed by every child or failed by all
    contribute nothing to the test. . . .

 7. The validity of a test is increased by having
    the easiest items first and the hardest ones
    last." (20)

These suggestions can be carried out by the classroom
teacher with a minimum of trouble. If this advice is
adhered to, the teacher will be rewarded by obtaining a
much higher degree of validity in his tests.

Reliability 

The second criterion of a good examination is the
reliability it possesses as determined by statistical
computation. "Reliability is synonymous with accuracy." (1)
A standard objective test does not measure up to the 
demands made upon it unless it is what it purports to be, 
viz., a scientific measure of pupil achievement in the 
individual subject or subjects for which it was intended. 
Just as a construction engineer would soon discard a faulty 


surveying instrument because of its inaccuracy, so too the 




(20) Ibid, pp. 31.


(1) Odell, C. W., Educational Measurement in High School, 
pp. 58, The Century Co., 1930. 




modern teacher must be ready to disregard the published 
test that does not possess a satisfactory degree of 
reliability. If two thermometers were placed outdoors 
and were exposed to practically identical conditions of 
sunlight and wind velocity, it would be normal to expect
them to register the same temperature. If a reading of 
one showed 70 degrees F. and the other 80 degrees F., the 
examiner would be in a quandary as to which to believe. 
Obviously, one or both would be in error, and a further 
checkup would have to be made to determine which was 
correct. In a like manner, two forms of a standardized 
test which cover the same ability should yield approxi- 
mately the same distribution of scores if administered to 
the same group on successive days under identical testing 
conditions.

"A test should measure accurately and consistently
whatever it attempts to measure. The degree to which it
does this is called its reliability." (2) To insure the
proper degree of reliability, the test must measure accurately
and consistently what it does measure; to insure
validity, it must measure what it is intended to measure.


Ruch says, "Reliability is second only to validity as a
criterion for the worth of a test or examination. This is to


(2) Lang, Albert R., Modern Methods in Written Examinations,
pp. 51, Houghton-Mifflin Co., 1930.




say that the second most important fact which we can know
about a test is the reliability which it possesses."(3)


Reliability is a much more restricted term than validity.
According to Ruch, reliability is one aspect of validity.(4)
Professor Odell expresses the same idea thus: "If a test
is not reliable, it cannot be valid, since if a test does
not measure whatever it measures accurately, it cannot
measure the thing it is supposed to measure."(5) It
is possible for a test to be highly reliable, however, and
yet have little validity. For example, suppose that a test 
intended to measure general knowledge of commercial 
geography required the reading of long or difficult exer- 
cises or questions. The test might be highly reliable, but 
might measure reading ability rather than geographical 
knowledge. 

The reliability of a test cannot be determined by an 
examination of the test itself, although various inferences 
about it can be drawn. It is the test-maker's duty to 
perfect his instrument so that the desired degree of 
reliability is obtained. The test-user will usually find 
information about the test reliability in the manual or
elsewhere, yet some tests are published without this information.
If this information is not forthcoming, the


(3) Ruch, G. M., The Objective or New-Type Examination, 
pp. 40, Scott, Foresman and Co., 1929. 


(4) Ibid, pp. 41. 


(5) Odell, C. W., Educational Measurement in High School, 
pp. 59, The Century Co., 1930.




assumption is that the test is new and its reliability has
not been computed, or that the test was not constructed by
an expert. The absence of this necessary information
tinges the examination with doubt as to its efficacy for
testing purposes. Odell says, "It has sometimes been suggested
that .90 be accepted as a standard for the coefficient
of reliability which all tests should attain. No such
exact critical point can be justified, but it is probable
that within a few years a majority of the tests receiving
wide use will have reliability at least this high."(6) It is
doubtful if any blanket statement can be made about the
requisite degree of reliability for tests because this
varies with the function that the test is to perform.
"Common sense will tell something about the probable
reliability of a test."(7) For example, suppose some broad
achievement like knowledge of the use of notes and drafts
in banking is being tested in commercial law. It is
obvious that a five-minute test consisting of merely two
questions would be a very inadequate sampling of the pupils'
knowledge of this subject. The results would be unreliable
and the entire procedure manifestly unfair if used as the
sole basis for determining pupil achievement on this
unit of work. Such broad achievement cannot be measured




(6) Ibid, pp. 66. 


(7) Ruch, G. M., and Stoddard, George D., Tests and 
Measurements in High School Instruction, pp. 54, 
World Book Co., 1927. 




accurately in five or ten minutes. Whether or not the
reliability of a test is to be considered satisfactory
depends largely on the purpose for which the results are
to be employed. "Many tests that do not yield individual
scores reliable enough to be trusted, do give fairly
accurate scores for groups of pupils of ordinary class
size or larger. In other words they are reliable enough
for use in judging the work of a class or of its teacher,
but not for that of individual pupils."(8) We can see then
that a survey test for measuring an entire school or school
system can yield reliable measures with a small number of
items (10 to 50, perhaps) since it is dealing with the
average scores of a large number of pupils. Yet, if any
individual pupil's score were taken at its face value it
might be entirely unreliable. High reliability is more
important for diagnostic use than for general survey
purposes.
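
The difference between group and individual reliability can be illustrated with a little arithmetic. The sketch below assumes that the errors of measurement are independent from pupil to pupil, in which case the error attaching to a group average shrinks with the square root of the number of pupils. The function name and the figures used are hypothetical and are not drawn from Odell.

```python
from math import sqrt


def error_of_group_mean(error_per_pupil, n_pupils):
    """Standard error of a group's average score.

    Assumes every pupil's score carries an independent error of
    measurement of the same size (error_per_pupil, in score points).
    """
    if error_per_pupil < 0:
        raise ValueError("error_per_pupil must not be negative")
    if n_pupils < 1:
        raise ValueError("n_pupils must be at least 1")
    return error_per_pupil / sqrt(n_pupils)


# Hypothetical short survey test: any one pupil's score may be off by
# about 6 points, yet the average of a large group is far steadier.
for n in (1, 36, 400):
    print(n, "pupils:", round(error_of_group_mean(6.0, n), 2), "points")
```

On these assumptions a score too rough to trust for one pupil may still serve for judging a class or a school system, which is the distinction drawn in the quotation above.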

Principal Methods for Determining Test Reliability

Reliability is more of a statistical concept than
validity. "Reliability is most often stated in terms of
reliability coefficients; i.e., correlations between the
scores earned by a group of pupils on two equivalent forms
of a test."(9) It devolves upon the test expert to make the


(8) Odell, C. W., Educational Measurement in High School,
pp. 60, The Century Co., 1930.


(9) Ruch, G. M., and Stoddard, George D., Tests and
Measurements in High School Instruction, pp. 52,
World Book Co., 1927.




necessary mathematical computations as the classroom
teacher ordinarily would not be conversant with the methods
used. In general, the three most common measures of
correlation used, according to Odell, are the product-moment
method and two varieties of rank correlation. It
mystifies the ordinary teacher when the coefficient of
correlation obtained by the rank method does not jibe with
that resulting from Pearson's product-moment formula.
Statistical experts have worked out methods, though, for
converting the coefficients of correlation obtained by the
different methods to an identical basis. Odell says,
"Although in a general sense any measure of correlation may
properly be called a coefficient, the expression 'coefficient
of correlation', abbreviated (r), is conventionally limited
to the product-moment formula. It ranges in value from
+1.00 down through zero to -1.00."(10)

Ruch suggests three common methods of finding reliability 
coefficients, viz.: 

"1. By correlation of the scores from duplicate or 
equivalent examinations administered to the same 
pupils. This is ordinarily the most accurate 
and defensible method. 

2. By splitting the results from a single examination
into chance halves, correlating the half-scores,
and 'stepping up' the resulting
coefficient of correlation by means of the
Spearman-Brown prophecy formula.


(10) Odell, C. W., Educational Measurement in High School,
pp. 580, The Century Co., 1930.


3. By repeating the same test or examination after
an interval and correlating the results. This
is often called the 'retesting coefficient of
reliability'. This method should never be
employed when the first or second methods are
possible." (11)

Theoretically, the first method is the best one to use
according to Professor Carlson.(12) This method cannot always
be used, however, because many tests do not have duplicate
or equivalent forms. In many cases involving tests in the
social-business studies, recourse must be had to the
second method. In using this procedure with single-form
tests, all the odd-numbered items are considered one test
and all the even-numbered items a second test. The
coefficient of correlation between the scores of many
pupils on the odd-numbered items and of the same pupils
on the even-numbered items is then computed. The resulting
coefficient should then be corrected by the use of the
Spearman-Brown prophecy formula. Carlson reports that: "This
second method (Chance-Halves or Odds-Evens Method) was the
method used in determining the coefficient of reliability
of each of the Carlson Bookkeeping Tests and each of the
Peters, Greiner, and Green Commercial Law Tests."(13) In all
these computations it is important that the scores from
several hundred pupils be used for each calculation.
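
The odds-evens procedure just described can be sketched in a few lines of Python. The answers below are hypothetical and cover only six pupils, far fewer than the several hundred recommended above; the function names are assumptions of this sketch. The stepping-up step uses the usual form of the Spearman-Brown prophecy formula for a test of doubled length, r_full = 2 r_half / (1 + r_half).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def spearman_brown(r_half):
    """Step a half-test correlation up to the full-length test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list per pupil, 1 for a right answer, 0 for a wrong one."""
    odd_totals = [sum(pupil[0::2]) for pupil in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(pupil[1::2]) for pupil in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Hypothetical answers of six pupils to an eight-item test.
answers = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
]
r_half, r_full = split_half_reliability(answers)
print("half-test r:", round(r_half, 2), " stepped-up r:", round(r_full, 2))
```

The correlation between the two halves understates the reliability of the whole test because each half is only half as long; the stepped-up figure is the estimate for the test at full length.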


As has already been pointed out, the third method is


(11) Ruch, G. M., The Objective or New-Type Examination,
pp. 415, Scott, Foresman and Co., 1929.


(12) Carlson, Paul A., The Measurement of Business
Education, pp. 11, South-Western Publishing Co., 1932.


(13) Ibid, pp. 12.




considered inferior to the other two. In this method, 
the same test is supposed to be repeated after an interval 
great enough to eliminate the memory effect and yet not 
long enough for much true growth in ability to take place. 
The question is: How judge this interval? It is logical 
to suppose that if any considerable period of time elapses 
a natural increase in ability will take place if normal 
classroom conditions obtain. On the other hand, if the 
test is repeated a very short time after it is given, the 
pupils will be bound to remember at least some of the 
answers that they recorded at the previous sitting. 
According to Odell, the product-moment method of
correlation is generally considered the standard method.(14)
It would probably be appropriate at this time to illustrate
this method for the benefit of the reader. Suppose we
consider the test results from two equivalent
forms of a test administered to the same group on successive
days under identical testing conditions.
An example involving the calculation of the coefficient of
correlation by the product-moment method is given on the
next page.
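
The arithmetic of the product-moment method can also be sketched in Python. The scores below are hypothetical, chosen only for illustration, and are not the figures of the thesis example; the function name `product_moment_r` is likewise an assumption of this sketch. The steps follow the usual deviation form of the formula: r = sum(xy) / sqrt(sum(x^2) * sum(y^2)), where x and y are deviations of the scores from their respective means.

```python
from math import sqrt


def product_moment_r(form_a, form_b):
    """Pearson product-moment correlation computed from deviations."""
    if len(form_a) != len(form_b) or len(form_a) < 2:
        raise ValueError("need two score lists of equal length (at least 2 pupils)")
    n = len(form_a)
    mean_a = sum(form_a) / n
    mean_b = sum(form_b) / n
    x = [a - mean_a for a in form_a]      # deviations on Form A
    y = [b - mean_b for b in form_b]      # deviations on Form B
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    if sum_x2 == 0 or sum_y2 == 0:
        raise ValueError("the scores on each form must vary")
    return sum_xy / sqrt(sum_x2 * sum_y2)


# Hypothetical scores of eight pupils on two equivalent forms.
form_a = [39, 45, 31, 50, 42, 36, 48, 29]
form_b = [41, 44, 33, 52, 40, 38, 47, 30]
r = product_moment_r(form_a, form_b)
print("coefficient of correlation:", round(r, 2))
```

A coefficient near +1.00 indicates that the two forms rank the pupils in nearly the same order, which is what is hoped for when the forms are truly equivalent.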


(14) Odell, C. W., Educational Measurement in High School,
pp. 588, The Century Co., 1930.


Calculation of the Coefficient of Correlation

[The handwritten table and computation on this page are not legible
in this copy. The columns appear to give each pupil's scores on
Examinations A and B (X, Y), the deviations from the two means
(x, y), the squares of the deviations, and the products xy. The
page closes with the statement of the correlation between
Examinations A and B, the value of which cannot be read.]


Professor Odell issues the warning that other information
should be obtained about the reliability of a test
besides the coefficient of correlation. He says, "Although
the coefficient of reliability is probably the most frequently
given measure of reliability, it is not very
satisfactory because its interpretation depends largely on
the range of ability in the group tested."(15) He suggests
that other measures of reliability be used to supplement it.
He proposes the following measures of reliability: "There
are four measures of reliability that are commonly employed
in connection with standardized tests. These are the
coefficient of reliability (r), the standard or probable
error of measurement (σ meas. or P.E. meas.), the ratio
of this error to the mean (m), and the ratio to the
standard deviation (σ)."(16) There is no doubt that this
additional information should enable the test expert to
judge better the accuracy of the test under consideration.
The drawback, however, is that the use of the mathematical
formulae involved is beyond the ken of the ordinary classroom
teacher and would have the effect of muddling rather
than clarifying the issue.
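
For the reader who wishes to see how the four measures hang together, a minimal sketch follows. It assumes the conventional textbook formulas, which are not quoted from Odell: the standard error of measurement is taken as σ times the square root of (1 - r), and the probable error as 0.6745 times the standard error. The function name, dictionary keys, and figures are hypothetical.

```python
from math import sqrt

# Conventional factor relating probable error to standard error
# under the normal curve (an assumption of this sketch).
PROBABLE_ERROR_FACTOR = 0.6745


def reliability_measures(r, sd, mean):
    """Return the four measures for a test with reliability r,
    standard deviation sd, and mean score mean."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie between 0 and 1")
    if sd <= 0 or mean == 0:
        raise ValueError("sd must be positive and mean must not be zero")
    se_meas = sd * sqrt(1.0 - r)                 # standard error of measurement
    pe_meas = PROBABLE_ERROR_FACTOR * se_meas    # probable error of measurement
    return {
        "r": r,
        "se_meas": se_meas,
        "pe_meas": pe_meas,
        "error_to_mean": se_meas / mean,
        "error_to_sd": se_meas / sd,
    }


# Hypothetical test: reliability .91, standard deviation 10, mean 50.
measures = reliability_measures(0.91, 10.0, 50.0)
for name, value in measures.items():
    print(name, round(value, 3))
```

Unlike the coefficient alone, the error of measurement is expressed in score points, so it tells the teacher roughly how far an individual pupil's score may be trusted.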

The next point that will be taken up concerns the factors
that influence reliability in a test. Symonds suggests a
number of factors that should be considered, viz.:(17)


(15) Ibid, pp. 61. 
(16) Ibid, pp. 60. 


(17) Symonds, Percival M., Measurement in Secondary
Education, pp. 289-295, The MacMillan Co., 1930.




1. Objectivity - A test paper is said to be objective if
a number of correctors, working individually, give it
exactly the same score. If the judgment of the corrector
enters into the determination of the score, then the test
is called subjective. This subject will be treated in
greater detail later in the thesis.

2. Length of the test - This is a cogent factor in considering
the reliability of tests. Tests or examinations
covering a certain unit of work have to contain a reasonable
number of questions in order to cover adequately the
instructional materials. A fuller development of this topic
will be presented under the topic "Comprehensiveness", later
in this thesis.

3. Evenness of scaling - Care should be taken to include in
the test items that differ from each other in degree of
difficulty. From this standpoint, the test items should
cover the entire range of difficulty. Odell says, "Another
factor which influences reliability is the scaling or
arrangement with respect to difficulty of the items or
exercises in a test. A test which does not have a large
number of very easy or very difficult items but has more
items near the middle range of the ability of the group to
be tested is, other things being equal, more reliable than
one which does not contain such items. Also a test which is
scaled in finer units tends to be more reliable."(18)


(18) Odell, C. W., Educational Measurement in High School, 
pp. 67, The Century Co., 1930. 




Professor Symonds illustrates the inter-relation of
reliability and scaling by a little example, which he demonstrates
by means of two figures, as follows:


Figure 1 - First Test


X      X      X      X
X      X      X      X
X      X      X      X
X      X      X      X
X      X      X      X


Figure 2 - Second Test


X X X X X X X X X X X X X X X X X X X X

He wishes to emphasize the point that the selection of items
should cover gradations of ability. "By selecting the items
in a certain way it is possible to construct a test of
twenty items so that it has no more reliability than a test
of less than ten items."(19) If in Figure 1 the first vertical
series represents five easy items that everyone passes, and
the last vertical row represents five very difficult items
that everyone fails, then these items are practically
useless in differentiating individuals, or for showing
exactly what any individual's ability is. The reliability
of the test, then, would suffer directly because of this
faulty scaling. Now, consider the situation in Figure 2.
Here the items cover the entire range of abilities, containing
probably one or two so easy that all pass and, probably,
one or two so difficult that no one gets a perfect score.
Symonds concludes that: "If the items of a test are equally


(19) Symonds, Percival M., Measurement in Secondary
Education, pp. 290, The MacMillan Co., 1930.




spaced in difficulty as in the second figure, there is no
such loss in reliability due to coarseness of the measuring
scale."(20)

4. Conditions of the pupils taking the test - Professor
Symonds gives the reader to understand that while the
condition of the pupil at the time of taking the test is
a factor in its reliability, the pupil's condition
does not merit as much attention as some writers have given
it. He says, "Evidently the cause for test unreliability
must be sought elsewhere than in the general condition of
the individual. This does not mean that one should entirely
discount these factors."(21) Odell says, "The conditions under
which tests are given also exercise some influence on the
equivalence of scores although if reasonable precautions
are taken this appears to be slight. For example, results
will probably agree slightly more closely if the two forms
of a test are given at the same time of day and perhaps
even if given on the same day of the week, but in most
situations this is not necessary."(22) It might cause a serious
disagreement between results, however, if one form of the
test is administered in the morning of one school day and
the other form administered late in the afternoon of a
successive day just before the pupils are dismissed to


(20) Ibid, pp. 291. 
(21) Ibid, pp. 295. 


(22) Odell, C. W., Educational Measurement in High School, 
pp. 68, The Century Co., 1930. 




attend an athletic contest or some social event. To
obtain approximately the same scores on two forms of a
standard test, identical testing conditions should
be maintained.

5. Familiarity of the pupils with the technique of
taking tests - Many test experts now agree that pupils
should be taught to take tests. This is particularly
true in high school with pupils in the social-business
studies. In eight years of teaching in the high schools
of Massachusetts, New Jersey, and Connecticut, the writer
has observed many cases where pupils were unable to do
justice to a test because they did not understand the
question in the objective test form into which it was cast.
In a study that Professor Symonds made, he showed that much
of test unreliability is due to lack of training in the
technique of taking a test. In his conclusions, he says,
"I would like to suggest the possibility of lowering test
unreliability by means of systematic training. Pupils
should be taught to take tests."(23) If pupils are unfamiliar
with the type of tests used, the scores at the
first testing will in all probability not be as representative
of their ability as those secured later.

Professor Odell calls attention to the wording of
directions to pupils as affecting the reliability of a


(23) Symonds, P. M., A Study of Extreme Cases of 
Unreliability, Journal of Educational Psychology, 
15:99-106, February, 1924. 




test. Full directions should be given the pupils, either 
in the test or by the teacher, so that there will not be 
any doubt in their minds as to what is expected of them. 
He suggests the following points which the directions 
should provide for:

"1. State briefly what the test is about.

2. Instruct pupils when to begin and where to
stop work, when to turn a page or not to turn
a page, and so forth.

3. Direct pupils whether to delay on each item
until they have answered it or to go ahead if
they do not know it.

4. Make clear the form of recording answers,
whether by writing words or numbers, underlining,
checking, or something else, including
a fore-exercise to illustrate the method of
response unless pupils are already thoroughly
familiar with it." (24)

What is a satisfactory degree of reliability? 

The answer to the question must remain a qualified
one. Before any answer can be given, knowledge must be
had of the individual test and the function or functions
that it seeks to perform. Reliability coefficients are
correlations and hence their magnitudes are influenced
by the range of abilities present in the group of pupils
used as the control group. Professor Odell says, "With
thirty minutes of testing where fifty or more objective
questions are asked one may expect to get over .80 for
a reliability coefficient."(25)


(24) Odell, C. W., Educational Measurement in High School, 
pp. 67, The Century Co., 1930. 


(25) Ibid, pp. 299. 




If the teacher desires to increase the reliability, it 
can be accomplished by increasing the testing time and 
including a larger number of questions. Teachers are 
coming to realize that in order to make testing worth- 
while, the test results should be reliable. If the test 
results are to be used for administrative or guidance 
purposes, the test should have a reliability of .90. 
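
How much longer a test must be made to reach a desired reliability can be estimated with the general form of the Spearman-Brown prophecy formula, r_k = k r / (1 + (k - 1) r), where k is the number of times the test is lengthened. The sketch below is offered only as an illustration: the function names are hypothetical, and the formula assumes that the added questions are of the same kind and quality as the original ones.

```python
def lengthened_reliability(r, k):
    """Predicted reliability of a test made k times as long."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    if k <= 0:
        raise ValueError("k must be positive")
    return k * r / (1 + (k - 1) * r)


def length_factor_needed(r, target):
    """Number of times the test must be lengthened to reach the target."""
    if not (0.0 < r < 1.0 and 0.0 < target < 1.0):
        raise ValueError("r and target must lie strictly between 0 and 1")
    return target * (1 - r) / (r * (1 - target))


# Hypothetical case: a thirty-minute test with reliability .80
# that is wanted for guidance purposes at .90.
k = length_factor_needed(0.80, 0.90)
print("lengthen the test by a factor of", round(k, 2))
print("testing time needed:", round(30 * k), "minutes")
```

In this hypothetical case the test would have to be made about two and a quarter times as long, which shows why the last few points of reliability are costly in testing time.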

Ruch and Stoddard(26) have formulated the following
table to assist the classroom teacher in interpreting the
reliability of a test:

Reliability Coefficients


0.95 to 0.99   Very high; rarely found among present tests.

0.90 to 0.94   High; equaled by a few of the best tests.

0.80 to 0.89   Fairly high; fairly adequate for individual
               measurement.

0.70 to 0.79   Rather low; adequate for group measurement
               but not very satisfactory for individual
               measurement.

Below 0.70     Low; entirely inadequate for individual
               measurement although useful for group averages
               and school surveys.


The authors hesitated to make the above statements, but
decided to because they felt that some concrete criteria
were due the reader.
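
A teacher who keeps test records mechanically might apply the table as a simple lookup. The sketch below is an illustration only; the function name is hypothetical, the coefficient is rounded to two decimal places because the table is stated that way, and a coefficient above 0.99 is placed in the highest class by assumption.

```python
def interpret_reliability(r):
    """Return the Ruch and Stoddard description for a reliability coefficient."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie between 0 and 1")
    r = round(r, 2)  # the table is stated to two decimal places
    if r >= 0.95:
        return "Very high; rarely found among present tests."
    if r >= 0.90:
        return "High; equaled by a few of the best tests."
    if r >= 0.80:
        return "Fairly high; fairly adequate for individual measurement."
    if r >= 0.70:
        return "Rather low; adequate for group measurement only."
    return "Low; useful only for group averages and school surveys."


# Hypothetical coefficients reported in three test manuals.
for coefficient in (0.92, 0.84, 0.66):
    print(coefficient, "-", interpret_reliability(coefficient))
```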

Objectivity 

The practising teacher recognizes that it is desirable, 
nay indispensable, to mark all pupils on the same basis. 
Depression conditions with their corollary, increased pupil 

(26) Ruch, G. M., and Stoddard, George D., Tests and
Measurements in High School Instruction, pp. 56,
World Book Co., 1927.




loads, have caused the teacher to avail himself of the
new-type test procedures in order to save time. Take,
for example, an ordinary class of thirty or thirty-five
pupils in Commercial Geography. A test of five essay
questions in this study might consume merely thirty
minutes of the pupils' time, yet the correcting of these
papers by the teacher would represent a long, tedious
chore. Notwithstanding this, there is no surety that the
papers will be graded on the same basis, as the teacher's
judgment enters into the marking. The teacher, being
human, is subject to the common human frailties. The
studies of Woods, Starch, Elliott, Kelley, and others
prove that the traditional essay-type examination is
almost impossible to mark with complete accuracy because
of subjectivity. "An examination should eliminate or
minimize subjective judgments in scoring, and the degree
to which it does this is called its objectivity."(1)

The lack of objectivity in a test is one factor of
unreliability which can most easily be remedied. Usually
if the teacher spends a little more time in the construction
of his new-type test, he can obtain the desired degree of
objectivity. Symonds says, "Of all the factors entering
into unreliability, lack of objectivity is perhaps the most
inexcusable, for it is usually possible, by exercising


(1) Lang, Albert R., Modern Methods in Yiritten Examina- 
tions, pp. 53, Houghton-Mifflin Co., 1930. 


teeta 
Soo os i 3 IOz , ieee 


od -_— —a- - ~~ —_— “=< eo ee eh eel i. : 
iotnos ‘ ee 
7 - Pall 
bas i> aes Da ‘. ~ “ 
;<t : eve. {th tres 
- = = 


J ; 7 ; yan 
= as - w i” Abe ee ee LiLo 


74. 


sufficient ingenuity, to turn a sudjective test into an 
Objective test with little or no loss of validity to the 
a ae 

Many critics call attention to the fact that the objective
test leaves much to be desired in the actual testing of
achievement. Their argument is, in effect, that if
the test is made completely objective it merely measures
the acquisition of facts rather than other more desirable
outcomes of the educational process. Professor Brueckner
says, "The scope of these tests should be broadened so as
to include the outcomes of learning such as interests,
appreciations, ability to apply, and the like which to many
are even more ... than the outcomes that we are now
able to measure."(3) In an illuminating article on neglected
aspects of educational measurements, Professor Uhl, after
a brilliant survey of the problem, concludes that: "Measurements
as usually administered fail signally to appraise
certain of these forms of ..."(4)

Objectivity in tests carried to extremes may encourage 
the development of dogmatism on the part of the pupils. 
Many times the pupils get the idea with objective tests in
Commercial Geography that a certain answer is right and
that no other answer would possibly do. They think thus
notwithstanding the fact that the proposed substitute
answer is merely the original answer paraphrased. That
this tendency should be militated against by the teacher
is obvious because, after all, the social-business subjects
are intended to inculcate a liberal outlook. H. A. Jeep
says, "Because of the impetus which objectivity has given
dogmatism, the present day test often tends to block progress
effectively along other lines of ..."(5)

(2) Symonds, Percival M., Measurement in Secondary
Education, pp. 290, The MacMillan Co., 1930.

(3) Brueckner, Leo J., The Validity and Reliability of
Educational Diagnoses, Journal of Educational Research,
September, 1935, pp. 4.

(4) Uhl, Willis L., Some Neglected Aspects of Educational
Measurement, The Journal of Educational Research,
December, 1933, pp. 241.

Other Desirable Characteristics 
I. Comprehensiveness 

It is doubtful if a pupil's total knowledge of any 
particular subject could be ascertained by existing test 
methods. If it were possible to secure this it is doubt- 
ful whether the information obtained would be commensurate 
with the time expended. Consequently, the situation 
resolves itself into the query: "How can we obtain the requisite
information about the achievement and educational development
of the individual pupil with a minimum of trouble?"
The practical teacher takes advantage of the principle of
sampling and infers the pupil's knowledge of the subject from
the sample taken. It is patent, then, that the examination
to be reliable must sample thoroughly. "It should cover a
wide and representative scope, and the extent to which it
does this is called its comprehensiveness."(1)


(5) Jeep, H. A., Must Objective Tests be Dogmatic, Education- 
al Administration and Supervision, March, 1933, pp. 181. 


(1) Lang, Albert R., Modern Methods in Written Examinations, 
pp. 54, Houghton-Mifflin Co., 1930. 




Sufficient questions should be included in the examination
so that all phases of the subject are adequately
sampled. For example, suppose that the Economics class has
just completed a unit of work taking three weeks' time. It
is obvious, then, that a test requiring about five minutes
would give an incomplete sample of achievement. Odell has
this idea in mind when he says, "Both from the standpoint of
the content of the test itself and of the reaction of the
pupils, it is in accord with common experience and proven
by actual experimentation that if the length of a test is
increased up to a reasonable limit, its reliability is
also increased."(2) A point can be reached in the
examination, however, when additional items will have but
little material influence upon the results. Professor
Odell's rule regarding the length of tests is contained in
the following statement: "It has been shown for similar
tests, that is, tests covering the same subject or phase of
a subject and containing the same types of exercises, the
reliability of a pupil's scores increases approximately as
the square root of the increase in length."(3) In other words,
if one of two similar tests is twice as long as the other,
its reliability is approximately 1.4 times as great.
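
The square-root rule quoted above lends itself to a short calculation. The Python sketch below merely illustrates the rule as stated in the text; the function name relative_reliability is hypothetical, and the rule itself is offered by Odell only as an approximation for similar tests.

```python
import math


def relative_reliability(length_ratio: float) -> float:
    """Approximate factor by which reliability changes when a test is made
    `length_ratio` times as long, following the square-root rule in the text."""
    if length_ratio <= 0:
        raise ValueError("length_ratio must be positive")
    return math.sqrt(length_ratio)


# A test twice as long, and one four times as long, compared with the original.
for ratio in (1, 2, 4):
    print(ratio, round(relative_reliability(ratio), 2))
```

Under this rule a test must be made four times as long to double its reliability, which agrees with the observation that additional items eventually have little influence upon the results.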

The examination should be divided off into fine
measuring units just as a yardstick is. It should not
contain many items so easy that everyone passes, or, conversely,
so difficult that everyone fails. Odell says,
"Another factor which influences reliability is the scaling
or arrangement with respect to the difficulty of the items
or exercises in a test. A test which does not have a
large number of very easy or very difficult items but has
more items near the middle range of the ability of the
group to be tested is, other things being equal, more
reliable than one which does not contain such items."(4)

(2) Odell, C. W., Educational Measurement in High School,
pp. 66, The Century Co., 1930.

(3) Ibid, pp. 66.
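
The point about item difficulty can be checked on a set of scored papers. The Python sketch below is an illustration only: the function names, the sample marks, and the cut-off values of 0.1 and 0.9 are all assumptions and do not come from Odell or from this thesis.

```python
def item_difficulty(responses):
    """responses: one list per pupil, with 1 for a correct item and 0 otherwise.
    Returns the proportion of pupils passing each item."""
    n_pupils = len(responses)
    n_items = len(responses[0])
    return [sum(pupil[i] for pupil in responses) / n_pupils
            for i in range(n_items)]


def flag_extreme_items(proportions, low=0.1, high=0.9):
    """Indices of items that nearly everyone fails or nearly everyone passes.
    The cut-offs are illustrative assumptions."""
    return [i for i, p in enumerate(proportions) if p <= low or p >= high]


# Made-up marks for four pupils on a four-item test.
responses = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
]
proportions = item_difficulty(responses)
flagged = flag_extreme_items(proportions)
print(proportions)
print(flagged)
```

Items flagged in this way add little to the fineness of the measuring scale and are candidates for revision or replacement.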
II. Facility 

The modern objective test to serve its full usefulness 
must be a time saver to both the teacher and the pupil. 
The greater amount of time spent by the teacher in the 
construction of a new-type examination is offset by the 
reduced scoring time. This should make for greater teaching
efficiency as the time saved might well go into better 
lesson preparation. The thought seems to be gaining 
momentum that the teacher's time is too valuable to be 
expended in the correction of poorly written material when 
the entire testing could be expedited by a test capable of 
being scored in a more or less mechanical manner. Lang 
says, "An examination should be easily administered and
scored, and the degree to which it meets this essential
requisite is called its facility."(1)

(4) Ibid, pp. 67.


(1) Lang, Albert R., Modern Methods in Written Examina- 
tions, pp. 56, Houghton-Mifflin Co., 1930. 




The facility of an examination involves two other
ideas, viz.: the necessity for definite instructions and the
cost of preparation. Pupils taking the examination should
be instructed as to what is expected of them in the
examination. In general, these instructions should be
contained right in the test rather than be given orally by
the teacher. All of the better published objective examinations
contain concise explanations together with fore-exercises
in order to explain to the pupils the nature of
the exercise that follows. Symonds says, "I would like to
suggest the possibility of lowering test unreliability by
means of systematic training. Pupils should be taught to
take tests."(2) As for the second idea, cost of
preparation, it is obvious that a published test would
lose its facility to the ordinary teacher if its cost were
prohibitive. It is considered better with teacher-made
tests for each pupil to have a mimeographed copy of the
questions rather than to have them written upon the blackboard.
The cost, then, must not be allowed to become a
serious factor.
III. Utility 
This characteristic has to do with the practical use
to which the examination can be put. Educational practice
has progressed long beyond the stage when teachers gave
tests merely to provide "busy-work" for the pupils.
Equally archaic, also, is the practice of some teachers of
giving an impromptu written test as soon as the supervisor
or principal steps into the room to listen to the lesson.
Such practices belong to other days and should be eliminated
if they reappear. We can see, then, that the utility of a
test to the individual teacher will depend upon the latter's
education, teaching experience, and educational philosophy.
The progressive teacher plans his testing program with the
assistance and advice of his supervisor, administers the tests,
scores and grades them as expeditiously as possible, diagnoses
the results for specific weaknesses, and then plans remedial
instruction to cover the particular shortcomings that have
been uncovered. Lang says, "The utility of a test is really
concerned with educational diagnosis and with its adaptation
for changing scores into meaningful ..."(1)

(2) Symonds, Percival M., Measurement in Secondary
Education, pp. 295, The MacMillan Co., 1930.

IV. Rapport 

One's interest in a task has a great deal of bearing 
upon whether it will be done or not. It is human nature for
the pupil to do first the things that interest him and to
do grudgingly and half-heartedly those things in which he
lacks interest. No teacher can escape this situation today;
it devolves upon every teacher to motivate his work so
that the smouldering interest of the pupils is fanned into
a bright flame. This attitude of the modern teacher
should carry over particularly into the testing program.
"An examination should create a feeling of interest and
at-easeness, and rapport is the degree to which this is done."(1)
If the test is made as interesting as possible, the 
pupils will attack it with more zeal. Above everything the 
test should be planned so that the pupils will be satisfied 
with the fairness of the results. One of the beauties of 
the objective test is that arguments about the test mark 
are practically eliminated. An examination should begin 
with a few easy questions that all can answer. To do this 
insures the proper mental "set" when the pupil progresses
in the test to the parts requiring more concentration.


(1) Lang, Albert R., Modern Methods in Written Examina- 
tions, pp. 59, Houghton-Mifflin Co., 1930. 




E. STANDARDIZED TESTS versus INFORMAL, TEACHER-MADE TESTS 


The test expert usually illustrates what "standardized 
tests" are by explaining the differences that exist 
between them and the informal objective tests. Many times, 
the one becomes the other after a long period of seasoning 
and frequent revisions. We might say, under these conditions, 
that the standardized test is the informal, teacher-made 
test that has "graduated". Usually the standardized test 
represents a more scientific and accurate instrument than 
the teacher-made test; greater care has been given to its 
preparation; its validity and reliability have been insured 
by the requisite statistical procedures. In degree of
refinement, the standardized examination is to the informal,
teacher-made test as a rapier made of the finest Milan
steel is to the ordinary butcher knife. Care should be
taken, however, to make sure that the "standardized test"
advertised as such is really entitled to be so described.
Ruch says, "Many well-known standard tests are in fact
fairly described as more-or-less objective examinations
with norms."(1)

A genuine standardized test must, however, meet much
more stringent requirements than mere possession of norms.
First of all, it should have "demonstrated validity resting
upon some more secure basis than personal opinion."(2)


(1) Ruch, G. M., The Objective or New-Type Examination, 
pp. 158, Scott, Foresman and Co., 1929. 


(2) Ibid, pp. 138. 




Its validation should be insured by the methods described 
previously in this thesis. All “dead timber" should have 
been eliminated, and the examination should really test 
what it is intended to. An instrument sufficiently refined 
to accomplish this must, of necessity, be the product of 
careful experimentation. A specific weakness of standard- 
ized tests is that they are not directly applicable to 
the local school situation but represent the generalized 
conditions prevailing in school circles. Ruch says, "The
validity of most standard tests is open to discussion
when we consider that local conditions vary so greatly."(3)
Undoubtedly many standardized tests do compensate for not
meeting the local school situation and can be used with
little or no adaptation. The great majority, however,
should be viewed with considerable suspicion. The logical
attitude for the teacher to assume is that neither the 
standardized nor the teacher-made objective tests should 
be paramount; that each has values the other lacks; and 
that the findings of one should be supplemented and tested 
by results from the other. 

Secondly, a "standardized test" should have "demonstrated
reliability."(4) This requirement looks to the
accuracy of the particular standard test as a measuring
instrument. Professor Ruch issues the following note of
warning: "Unfortunately there are not a few standard
tests, held rather generally in high repute, which yield
reliabilities far below those ordinarily to be obtained
by a thirty-to-fifty-minute informal objective classroom
test of the unstandardized type."(5) The use of such a
test might result in a gross mis-measurement of the pupils,
and it is doubtful if the teacher is justified in using
it in spite of the availability of norms. Conceding that
there are defects in the standard test, Professor Symonds
states that: "Scores on standard tests are not absolutely
accurate, but they are accurate enough so that we may
place considerable confidence in test results."(6) It is
doubtful if Symonds intended this as a generic statement
covering all classes of standard tests. What should be
done is to examine the particular standard test included
in the testing program to see if it meets the requirements
of a "standardized" instrument. Otis' "Scale for Rating
Tests" might be helpful in this capacity.

(3) Ibid, pp. 140.

(4) Ibid, pp. 138.

The third requirement for a standardized test is that
it must have "a reasonable degree of objectivity of scoring."(7)
This requirement directs attention to one of the main
differences between the standardized and the informal,
teacher-made tests. As the former represents much more time
and effort in the making, it is a more polished tool;
consequently it has greater objectivity than the teacher-made
variety of test. Odell recognizes this when he says,
"A second characteristic of classroom tests is that
although they employ to a large extent the technics that
make for objectivity in standard tests and are therefore
much more objective than the traditional form of essay-type
examination, they are far less objective than standard
tests, since they depend on the teachers' judgment for
their ..."(8) It is safe to interpret from this statement
alone that the standardized test really has intrinsic
values that prevent it from being displaced by the informal,
teacher-made test no matter how popular the latter may be.

(5) Ibid, pp. 139.

(6) Symonds, Percival M., Measurement in Secondary
Education, pp. 299, The MacMillan Co., 1930.

(7) Ruch, G. M., The Objective or New-Type Examination,
pp. 139, Scott, Foresman and Co., 1929.

The final requirement of a standardized test is that
it have "norms or standards for evaluating the results
obtained by the ..."(9) According to Professor Ruch, this
requirement is not as important as the preceding ones.
"The most important thing in all this discussion of reasonable
standards of attainment is the recognition by test-users
of the principle that no one norm of performance can be
set up which will have universal validity for all pupils or
all schools."(10) It is generally being recognized now that the
norms on a particular standard test may or may not be of
value to the teacher. Before any norm can be used, the
conditions under which it was obtained must be considered.
Obviously, the norm based upon the results of the standard
test in city schools would not be applicable to performance
in rural schools. Arguing in the same vein, a norm obtained
from high-ability pupils could not be used in interpreting
the scores of low-ability pupils; nor would it necessarily
be fair to compare the relative teaching abilities of two
teachers by making their respective classes submit to a
standard test and then comparing the test scores with the
norm. Such an idiotic policy would be manifestly unfair
because certain pedagogical methods stress drill and the
acquisition of knowledge, while others emphasize the
gaining of appreciations and the establishment of desirable
attitudes and ideals. All norms must be interpreted in the
light of the local teaching situation. How is it possible
to circumvent this limitation? Some authorities have
suggested that while one norm is inadequate, many norms,
each obtained under different conditions, might be the
solution. Thus we should have a norm for results in rural
schools, one for city schools, another for low-intelligence
groups, yet another for high-intelligence groups, and so on
until all the varied conditions are covered.

(8) Odell, C. W., Educational Measurement in High School,
pp. 58, The Century Co., 1930.

(9) Ruch, G. M., The Objective or New-Type Examination,
pp. 159, Scott, Foresman and Co., 1929.

(10) Ruch, G. M., and Stoddard, George D., Tests and
Measurements in High School Instruction, pp. 17,
World Book Co., 1927.




Some authorities differentiate between norms and
standards; others use the terms as if they were synonymous.
Odell says, "It is unfortunately true that most
authors and publishers of tests have not attempted to set
up standards, but have merely reported norms and left the
determination of standards to those using the tests."(11) The
norm represents the results of existing conditions in the
schools surveyed; the standard, the final goal or ideal
to be approximated or attained.

In summary, the weight of opinion seems to be against 
the use of blanket norms because of the diverse conditions 
existing in the various school systems. Some authorities 
hold that a norm should be derived for each different set 
of conditions. Professor Ruch suggests that: "The con- 
structor of objective tests must seek other means of 
interpretation than through the use of norms. Local norms 
may be derived with the accumulation of records, and in 
the long run, interpretations may be made quite as accurate 
as practical demands require."(12) In other words, each school
system may derive its own norms from the results of the 
standard test used over a period of time. 
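
The derivation of a local norm from accumulated records can be sketched in a few lines. The Python example below is a hedged illustration, not a procedure given by Ruch or Odell: it assumes that the median of the accumulated scores serves as the local norm, and the function names and the scores themselves are invented for the purpose.

```python
from statistics import median


def local_norm(scores):
    """Local norm taken, by assumption, as the median of accumulated scores."""
    return median(scores)


def percentile_rank(score, scores):
    """Percentage of accumulated local scores that fall below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100.0 * below / len(scores)


# Made-up scores accumulated from earlier administrations of one standard test.
accumulated = [42, 55, 61, 48, 70, 66, 53, 59]

print(local_norm(accumulated))
print(percentile_rank(61, accumulated))
```

A pupil's score can then be read against the record of his own school system and not against a blanket norm gathered under different conditions.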

The evidence given so far points to the inevitable
conclusion that the standard test today is not a perfect
measuring instrument. Does this mean, then, that the use
of standard tests should be abolished? Even harsh critics
would not suggest this. The solution is to include both
informal, teacher-made and standard tests in the well-rounded
testing program. Gale Smith says, "Everyone now
realizes that standardized tests are indispensable in
school work, but from the point of view of the classroom
teacher, of the supervisor, and of the administrator, the
greatest opportunities in testing today are in the use of
new-type, objective tests which are not standardized."(13)
Odell concludes that: "One cannot avoid the conclusion
that no testing program for a semester, year or other long
unit of work in a subject can be well balanced unless it
includes both standardized and non-standardized tests.
Ordinarily, if not always, the number of the latter should
exceed that of the former, their respective proportions
depending partly on how satisfactory is the supply of
standard tests available in the subject being dealt with."(14)

(11) Odell, C. W., Educational Measurement in High School,
pp. 452, The Century Co., 1930.

(12) Ruch, G. M., The Objective or New-Type Examination,
pp. 66, Scott, Foresman and Co., 1929.


(13) Smith, Gale, How to Construct and Use Non-Standardized 
Objective Tests, pp. 8, The Benton Review Shop, 1929. 


(14) Odell, C. W., Educational Measurement in High School, 
pp. 473, The Century Co., 1930.




F. FUNCTIONS OF TESTS 


It is futile to attempt to plan an examination until
the function or functions to be served are clear in the
teacher's mind. Lang says, "The function of any activity
should be determined before the procedures and materials
are selected."(1) Behind each test or examination must be the
test-maker's purpose for the individual test. Tests or
examinations that are given merely for the sake of "busy
work" are an educational crime. Many times teachers will
give written work, supposedly as a test, and no sooner
has the class filed out of the room than they will take the
papers and relegate them to the waste basket. When
challenged about this practice, they offer the lame excuse
that the pupils got the benefit of organizing and writing
down their thoughts on paper anyway. It is doubtful if
the giving of "tests" to keep the pupils occupied for the
period is a justifiable procedure in the light of modern
educational theory, and it is certain that such practice
would be severely censured if it came to the attention of
the state supervisors.

(1) Lang, Albert R., Modern Methods in Written Examinations,
pp. 20, Houghton-Mifflin Co., 1930.

Just as each course of study has aims and objectives
to fulfill, so each examination has its function or
functions as its underlying basis. The function of an
examination may be likened to the tiller of a sailboat;
both give point or direction. The thought prevailing
in the modern test movement is that extra time spent in
the planning and actual drafting of the examination is
more than compensated for in the correction and interpretation
of results. Unfortunately, there is no agreement
among educational authorities as to the functions that an
examination is supposed to serve. Lang says, "In any
classification of functions of this kind there will be
more or less overlapping."(2) Each examination is almost
sure to include more than one function. For example, a
test aiming to determine the achievement status of the pupils
will also have some diagnostic value, although, of course,
not nearly so much as if the test were intended primarily to
aim at diagnosis. The following nine examination functions
are suggested by Professor Lang:

I. Testing Retention of Information 

Teachers, as a rule, have difficulty in testing in the
social-business studies; hence, the need for such a study
as this. All too often the tendency for the classroom
teacher is to overrate the amount of information the pupil
has retained after his study of a given unit of work.
Lang says, "Teachers and students alike are inclined to
overestimate the mastery of subject-matter which has been
studied."(3)

(3) Ibid, pp. 22.


The meagre retention of information on the part of the
pupils after so much supplementary reading and explanation
is a source of concern to the teacher in the secondary
school. This problem can be solved in part, at least, by
giving unit tests systematically and then planning remedial
instruction on the basis of the results shown. The thought
that nothing short of complete pupil mastery should be
the goal has been advocated in certain quarters. The chief
exponent of this school of thought suggests the following
"mastery" formula: "Pretest, teach, test the result, adapt
procedure, teach and test again to the point of actual
learning."(4) It is generally conceded that there is genuine
merit to this formula. Whether pupil mastery is the goal,
or whether the teacher is satisfied with results short of
complete mastery, the efficient teacher must have a definite
planned program of unit tests combined with a follow-up of
diagnosis and remedial instruction.

The check-up test will occasionally show results
that are quite alarming to the teacher, i.e.,
that the instruction on a certain unit of work has been
entirely ineffective. Many times such a condition arises
at the beginning of the year's work with groups that the
teacher has not as yet been able to size up. If the school
employs a psychologist, it would be a good idea to talk
the problem over with him. He might advise that suitable
psychological examinations be administered to ascertain the
intelligence of the group. Moreover, because of the
psychologist's past training, he could help the teacher
adjust the pedagogical methods in order to meet the needs
of this special group. If certain individuals then do not
respond to this modified instruction, case studies must be
made of them.

(4) Morrison, Henry C., The Practice of Teaching in the
Secondary Schools, pp. 79, University of Chicago
Press, 1926.

II. Determination of Achievement Status 

So far we have concerned ourselves with the test 
covering a unit of work. In most cases, there will be the 
further necessity of determining the pupils’ complete 
status in a subject from time to time. Perhaps the distinction
between the two types of tests may be made clear if
the first is called a "check-up" test and the second, an
"inventory" test. The former is intended to measure the
retention of information relative to a unit of work; the 
latter emphasizes all that has been attained up to a given 
time in a subject of study. 

It is evident from the foregoing explanation that
this second function of examinations is one of the most
important; in fact, according to some educational authorities,
the most important examination function. Professor
Ruch holds that: "The measurement of achievement has been
admittedly the principal reason for examinations."(5)


Examinations administered to determine the achievement status
of pupils in a certain subject or in an entire school
system have sometimes unearthed startling deficiencies in
the present educational program. These results have been
seized upon by carping critics of the secondary schools
who have been delighted to herald them far and wide. Lang
says, "It is from the standpoint of determining achievement
status that examinations have been submitted to the most
severe censure and the most unfriendly criticism."(6)

The results from achievement tests are not only in- 
Valuable to the classroom teacher in judging the effective- 
ness of his instruction, but also of extreme value to the 
executives in bringing about the proper orientation of 
pupils. Suppose that a pupil transfers from a small, 
unknown secondary school to a large, up-to-date city high 
school. Needless to say, a question would arise as to 
whether the pupil's development was commensurate with the 
new strain to which he would be subjected. The unaided 
judgment of the executives alone would be unreliable in 
the grade-classification of such a pupil. This dilemma 
could be solved by testing the pupil as to his development
in the new subjects for which he has enrolled. If he falls


(5) Ruch, G. M., The Objective or New-Type Examination, 
pp. 16, Scott, Foresman and Co., 1929. 


(6) Lang, Albert R., Modern Methods in Written Examina- 
tions, pp. 26, Houghton-Mifflin Co., 1930. 




down markedly in any of the tests, it is safe to assume
that he has an insufficient background in that subject or
subjects and his educational guidance will have to be 
varied accordingly. The above conclusion, of course, is 
predicated upon the premise that searching "inventory" 
tests are administered in the subjects under question. 
The other uses of achievement tests are legion. 

Another type of achievement test that has received 
wide publicity in recent years is the "pre-test". Lang 
says, "Many teachers now give a pre-test before assigning
a new unit of work." (7) In discussing the need of a testing
program, Professor R. G. Walters says, "Achievement tests
should do three things: discover what the student knows;
how well he understands what he knows; and how well he can
apply what he knows. In other words the achievement tests
should consist of fact, thought, and application [questions]." (8)
After the teacher gives a pre-test on a new unit of work, 
he obtains the information necessary to plan his development 
of the subject. 

Needless drill on the phases of the unit with which 
all the pupils are familiar would not only be deadening to
pupil interest but also be an insensate squandering of
time. The Morrison Unit Plan presupposes the giving of a


(7) Ibid, pp. 27.


(8) Walters, R. G., Modern Methods of Teaching Commercial 
Subjects, Monograph No. 16, pp. 23, South-Western 
Publishing Co., 1932. 




pre-test in the "exploratory" stage of a unit for the 
reason mentioned above. The main value of the pre-test
is that it indicates the foundational preparation of the 
pupils for a new unit of work. 

The general survey test is yet another achievement 
test that enables school work to be evaluated. This type 
of test frequently embraces the schools in an entire 
system. The Rochester School Survey in Senior High School 
Social Studies may be cited in this connection. This 
Survey, starting in September, 1925, extended over a period
of two years. A number of valuable conclusions were
arrived at as a result of this cooperative testing movement. 
"The starting of the survey work really put under way six 
important movements in the field of the senior high school
social science: re-weighing and clarifying of objectives;
re-organization of subject-matter in the light of these
clearer objectives; re-organization of classroom procedure
for the betterment of the chosen aims; better articulation
between departments both in the same schools, and among the
different schools, and between different school levels;
greater knowledge of how to test scientifically, and to
compute and use statistical results; finally, the establish-
ment of more scientific remedial work." (9) It is evident that


(9) Gibbons, A. N., Tests in the Social Studies. A 
Record of a Testing Experience in the Senior High
School Social Studies. National Council for Social 
Studies, pp. 7, Athens Press, 1929. 




the cooperative testing program was planned carefully by
committees of teachers. The first year was to be given
over to study and experimentation in the individual 
schools; the second, to the administration of a series 
of uniform city-wide survey tests. Mr. Gibbons
summarizes the values from the first year thus: "Possibly
the greatest value that came from that first year concen-
trated upon experimental factual testing was the unanimous
conviction that the objective of factual mastery must be
subordinated to higher [aims]." (10) The general values
accruing from a self-survey of the Rochester type are 
bound to react toward a raising of professional teaching 
standards through-out the system. 

That the achievement survey has vital significance 
to the teaching staff is the contention of most authorities. 
Professor Van Wagenen maintains the following: "The most 
important purpose of an achievement survey probably consists 
in acquainting the teachers with the actual educational 
conditions existing in the school system, the realization 
of its strong and weak points as well as its present 
standards of [achievement]." (11)

III. Stimulation of Daily Work 


Motivation is a term that looms large in the profes-
sional teaching literature of the day. A lesson properly


(10) Ibid, pp. 19.


(11) Van Wagenen, M. J., Educational Diagnosis and the 
Measurement of School Achievement, pp. 224,
the MacMillan Co., 1926.




motivated will go over with a minimum of trouble, and,
conversely, one lacking in motivation may be fraught
with petty annoyances or even serious discipline problems. 
How to get this all-important factor marks the difference 
between an excellent teacher and a mediocre one. The 
ideal situation would be where the pupils are stimulated 
to prepare their lessons adequately because of real joy 
derived from the work itself, or because there is a 
realization of its probable usefulness. Lang says, "As 
every teacher knows, however, ideal and natural motives 
do not always make a strong enough appeal to students to 
stimulate adequately a satisfactory type of daily prepara-
tion." (12) As this is true, the efficient teacher many times
will have to resort to extrinsic motivation. At all 
times the motivation should be kept as closely associated 
with the activity itself as possible. As we know, the 
incentive for study on the part of the pupils should be 
intrinsic in nature, viz., should arise from the interest 
and joy the pupil has in his studies; lacking this, the 
promise of a test or an examination, even though not so 
perfect a motivation, should produce the desired result of 
adequate lesson preparation. 

Tests and examinations are effective means of stim-
ulating school work. Pupils in the secondary school are


(12) Lang, Albert R., Modern Methods in Written Examina-
tions, pp. 27, Houghton-Mifflin Co., 1930.




already familiar with tests and examinations of all types 
because of their previous educational experiences in the
lower schools. Professor Ruch says, "That examinations do 
have this value has been tacitly agreed but never proved. 
In spite of this dearth of proved fact, it does seem 
reasonable to suppose that pupils strive for somewhat
greater and somewhat more permanent mastery when they
realize that searching examinations may be expected at a
later date." (13) The test or examination could be increased
in value as a motivator if certain cardinal rules and
principles were constantly borne in mind. The first rule 
is that pupils should have knowledge of the progress they
are making. If a test or examination is given the ordinary 
class, the pupils are on "pins and needles” to learn what 
score or grade they obtained. In order to utilize this 
interest, the test should be corrected and scored while 
this feeling is at a high pitch. This is the time when 
remedial work will reap its richest harvest. This is a 
potent argument for the giving of short daily tests, having 
the pupils exchange papers, and having the correcting done 
by the pupils themselves. If the teacher then calls for
a show of hands as to the items missed, he has a good
basis for a remedial lesson with the pupils keyed to the




(13) Ruch, G. M., The Objective or New-Type Examination, 
pp. 10, Scott, Foresman and Co., 1929. 


proper degree of receptiveness. Carlson argues that:
"Experimental psychology has repeatedly demonstrated that
pupils do much better when they are kept informed of the
results than when they are kept in ignorance of their
degree of [success]." (14) In experimenting with his own classes,
the writer has found that the posting of test scores on 
the bulletin board in the form of a bar graph has aroused
great interest. The length of each bar indicates
the total accumulated score for each pupil. 
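
The posting of accumulated scores described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `cumulative_bars`, the pupil labels, and every score shown are hypothetical and are not taken from the writer's classes.

```python
def cumulative_bars(scores_by_pupil, width=40):
    """Return one text bar per pupil, scaled to the highest cumulative total."""
    # Total accumulated score for each pupil across all tests given so far.
    totals = {name: sum(scores) for name, scores in scores_by_pupil.items()}
    top = max(totals.values()) or 1  # avoid dividing by zero if all totals are 0
    lines = []
    # Longest bar first, so the standing of each pupil is seen at a glance.
    for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
        bar = "#" * round(width * total / top)
        lines.append(f"{name:<10} {bar} {total}")
    return lines


# Hypothetical scores on three short tests (invented for illustration only).
sample = {
    "Pupil A": [18, 20, 22],
    "Pupil B": [12, 15, 14],
    "Pupil C": [20, 10, 0],
}

for line in cumulative_bars(sample):
    print(line)
```

Each new test simply adds one more score to a pupil's list, so the chart can be reposted after every short test while interest is still high.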

The second rule for increasing the motivating power 
of tests and examinations is that they should come at 
frequent intervals - a planned sequence should be arranged. 
To give long tests infrequently is to delay the day of 
reckoning so long that the pupils are not stimulated. 
Carlson suggests that: "A twenty-five question, short- 
answer test need not consume more than five minutes of 
the class period for writing and ten minutes for correction 
and tabulation of results." (15) It should be evident that the
giving of such a short test would dispense with the need 
for a great deal of oral quizzing on the part of the 
teacher. 

That pupils may be made to realize the help they can 
obtain by using tests of a detailed, specific, and diagnos-
tic character is a third important rule. At the outset,


(14) Carlson, Paul A., The Measurement of Business 
Education, pp. 8, South-Western Publishing Co., 1932. 


(15) Ibid, pp. 8.




the pupils will regard diagnostic tests as drudgery, but
their viewpoint will change when they begin to realize
the value of these tests. 
IV. Motivation of Reviews 

The frequent review is an integral part of good 
teaching. The teacher who fails to allow sufficient time 
for review work is committing a grave error since he is 
not taking into consideration the psychological law of 
forgetting. Lang says, “Learning no sooner takes place 
than the process of forgetting sets in. The tendency to 
forget is a natural process and a fortunate one. Many 
trivial things and much that is not true are learned. It 
would be a great handicap to learning if the mind were 
permanently cluttered up with such [material]." (16) For-
getting is a good thing because the mind is cleared of
many untruths that have been learned, and, further,
because it makes possible the selection and organization 
of the particular material that it is desirable to make a 
permanent possession. 

The examination is an effective means for the motiva- 
tion of review work. The proper kind of review requires 
the pupil to go over his material carefully, sifting out 
the salient points from the unimportant, and then organiz-
ing them in some sort of usable outline. Such a review




(16) Lang, Albert R., Modern Methods in Written Examina-
tions, pp. 30, Houghton-Mifflin Co., 1930.




entails a critical and deliberate evaluation of the
material. It is not to be confused with a hasty and
superficial type of mental activity known as cramming.
V. Provision of Objective Standards

How many times have teachers complained that their 
classes this year are much poorer than those of preceding 
years? All too often such an assertion is made to cover 
up poor teaching on the part of the teacher. In most 
cases, the teacher bemoaning the poor abilities of his 
pupils, is drawing this conclusion from his own experiences. 
Such an unscientific judgment may or may not be true, with 
the possibilities being against it. The teacher usually 
falls back upon general impressions; such generalities 
are more or less meaningless. To make a real comparison 
of the work from year to year, objective tests would have
to be administered to each group in turn, and the results 
compared with the norms or standards that have been derived. 
Any judgment rendered then would be founded on actual 
achievement as determined by suitable objective tests. 
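
The year-to-year comparison against a norm can be made concrete with a small sketch. It is offered only as an illustration: the function name `compare_to_norm`, the norm of 68, and the class scores are assumptions invented for the example, and the arithmetic mean is used here simply as one convenient summary of a class.

```python
from statistics import mean


def compare_to_norm(scores_by_year, norm):
    """Return {year: class mean minus norm}, rounded to one decimal place."""
    return {
        year: round(mean(scores) - norm, 1)
        for year, scores in scores_by_year.items()
    }


# Hypothetical scores of two successive classes on the same objective test.
classes = {
    1933: [62, 70, 58, 66],
    1934: [71, 75, 68, 74],
}

gaps = compare_to_norm(classes, norm=68)
for year, gap in gaps.items():
    print(year, "above norm by" if gap >= 0 else "below norm by", abs(gap))
```

A negative difference shows a class below the norm and a positive one a class above it, which replaces the general impression with a figure that can be checked.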

The accumulation of examination scores over a period 
of years on the various tests in a well-rounded testing 
program is a great aid to the teacher in judging the


effectiveness of his instruction. Occasionally a group 




will be met with that falls far below the established
standard. Such a situation is really a challenge as
ordinary instructional methods will be ineffective.
The fact that this class has failed to measure up to
the standard is an index that the entire course will
have to be revamped in order to meet their needs. If
the teacher did not early discover that this group was
sub-normal, he might blame himself for poor teaching
when the standard pedagogical methods failed.

An attitude of complete slavishness to norms or
standards is not a good thing. As has been shown else-
where in this thesis, the method of deriving the norm
must be examined critically. When a test with norms is
used, the teacher must emphasize the points that appear
in it; otherwise the pupils will not attain the norm level.
In some instances, tests do include obsolete material.
Lang concludes that: "The use of standardized test norms
as objectives tends to perpetuate obsolete material and
to fix a traditional school [curriculum]." (17)

VI. Measurement of Teaching Efficiency 

The teachers and supervisors both realize that teach- 
ing efficiency must be gauged if the vocation is to be 
really put on a professional basis. The question still
persists as to how this can be accomplished. Protagonists


(17) Ibid, pp. [illegible].


of the new-type test immediately proclaim that their
teaching procedures and methods provide the needed succor.
At any rate, it is now generally agreed that the old 
impressionistic methods of rating the teacher after a 
five minute sampling of his work should be consigned to 
the museum of educational curios, never again to be 
salvaged for use. Such methods were astounding in their
finality; unfortunately, they still persist in our 
educational structure. Subjective ratings of any kind 
are to be regarded with suspicion; and in the case of a 
rating based on a sampling of five minutes’ work, they are 
doubly so. Something unforeseen might arise during the
sampling period that might be entirely unrepresentative
of the real work being accomplished in the course. If the
supervisor placed much significance in the sample, it would
result in a grave injustice to the teacher and a direct 
blow at his professional standing. Supervision should be 
conducted systematically if it is to serve its main 
purpose, viz., the improvement of teaching efficiency. 
Does the above indictment of subjective teacher- 
rating methods mean that there is to be no supervision? 
Positively not! There is a real need for the right kind
of supervision. A supervisor to attain his full degree
of usefulness must drop the role of being a critic or
stern judge and assume the part of a friendly counselor
who by his past educational background and experience is
qualified to offer guidance to the teacher of such a
nature that the latter's efforts will be more successful.

The problem of evaluating the efficiency of the
teacher by tests of his pupils' accomplishments is worthy
of serious consideration. Ruch says, "It was on this
point that Dr. J. M. Rice drew so much fire from the
National Education Association at the beginning of the
century." (18) The prevalent thought was that the results of
standard tests administered to the classes of different
teachers would indicate relative teaching efficiencies.
This theory assumed that high accomplishment on standard
tests indicated effective teaching, and low accomplishment,
per se, unsatisfactory teaching. The weaknesses in this
point of view were soon exposed. Ruch holds that: "The
standard test method made no allowances for differences in
pupils' mental equipment, the most important single factor
controlling the rate of learning yet [discovered]." (19) A standard
test purporting to test pupil accomplishment in Commercial
Law would, of necessity, yield lower scores with a low-
ability group than with a high-ability one. Relative
teaching efficiencies of two teachers can be determined


(18) Ruch, G. M., The Objective or New-Type Examination, 
pp. 12, Scott, Foresman and Co., 1929. 


(19) Ibid, pp. 12.


only if they are working with groups of about the same
ability. It is only after the school psychologist has
proved that two groups have approximately the same mental 
abilities that scores on standard tests can be compared 
fairly. Then, again, it must be realized that standard 
tests are not entirely adaptable to local conditions, 
and that they are open to abuses through coaching. For 
that reason, it is probable that the Commercial Department 
Head should take advantage of locally constructed tests 
in solving the supervision problem. Professor Ruch con-
cludes that: "Where objectives and aims can be translated
into concrete test situations, supervision through locally
constructed tests is far more economical than personal
[supervision]." (20)
VII. Improvement of Teaching Efficiency 

Because a teacher's professional experience extends
over many years does not necessarily prove that the
teacher need not concern himself with improving his
efficiency. It is possible that this extended period of
service has merely resulted in fixing habits that were
faulty in respect to sound educational practice. Even
if the teacher is thoroughly efficient, he must keep
abreast of changes in the field, and adapt his methods to
meet the changing situation. Usually this can be


(20) Ibid, pp. 13. 


accomplished by three means, viz.: reading the best
professional magazines, taking courses in the graduate
schools of great universities, and attending teaching 
institutes and conventions. One of the momentous 
problems that new superintendents face in coming into a 
school system is how to improve the teacher in service. 
The expression that the teacher "has gone to seed" is used 
quite a lot in teaching circles. This type of individual 
must be helped to find himself, or he must be eliminated 
from the system. It is a moot point whether teacher 
tenure does not build up in the teacher the false outlook 
that he does not have to concern himself with self- 
improvement as his position is secure. Lang says, 
"Improvement of efficiency through-out the teaching career 
should be the constant concern of every member of the
[profession]." (21)

It is only natural that the superintendent should be
concerned with the problem of supervising the teacher for 
the purpose of helping the latter to gain greater teaching 
effectiveness. Supervision is of such great importance 
that it must be carried on systematically and with a 
minimum of friction. Professor Van Wagenen says, "The 
effectiveness of supervision in a school system is
dependent on several important factors: the confidence


(21) Lang, Albert R., Modern Methods in Written Examina-
tions, pp. 37, Houghton-Mifflin Co., 1930.




of the teacher in the superintendent's or supervisor's 
professional integrity, his range of professional 
information, and his ability to show the teacher what 
needs be [done]." (22) The superintendent's efforts will be
abortive until he gains the goodwill and confidence of
his staff. Changes suggested by the superintendent as
a result of his class visitations should be reasonable 
and based on logic apparent to the teacher. 

In his systematic visitations, the superintendent 
or supervisor will have to use some teacher-rating method.
Alberty and Thayer say, "The motive for introducing teacher-
rating schemes is clear. It is to substitute an objective
and accurate appraisal of teaching success for the old
method of general [impression]." (23) They suggest three general
types of teacher-rating plans: "score cards of teacher
traits, man-to-man comparison scales, and measurements of
teacher efficiency based upon [pupil achievement]." (24) The score
card is probably the most widely followed plan for rating 
teachers. It consists of a number of traits that are 
essential in good teaching, listed so that each one can 
be appraised separately. Needless to say, a supervisor 
must have long experience in rating teaching before he can 
use the score card or man-to-man comparison scales with the 

(22) Van Wagenen, M. J., Educational Diagnosis and the
Measurement of School Achievement, pp. 68,
the MacMillan Co., 1926.


(23) Alberty, H. R., and Thayer, V. T., Supervision in
the Secondary School, pp. 142, D. C. Heath & Co., 1931.


(24) Ibid, pp. 143. 




proper degree of success. The third plan, the measurement
of teaching effectiveness by the use of achievement tests,
is of prime importance when viewed in the light of the
modern test movement. The present writer contends that
supervision would be greatly facilitated by the setting
up in the secondary school of a well-rounded testing program
consisting of both teacher-made and standardized objective 
tests. The results of these tests, carefully interpreted, 
should prove a boon to the busy superintendent. 

That examinations afford great possibilities for 
improving teaching efficiency is now being recognized by 
the entire teaching profession. There are a number of 
ramifications of this thought. In the first place examina- 
tions are conducive to teaching efficiency since they 
supply the teacher with a knowledge of a student's 
achievement-status, and a better understanding of his
shortcomings and difficulties. With this information, the 
teacher is better able to adapt his work to pupil needs 
and interests. Secondly, the construction of a good exam- 
ination compels a teacher to determine the objectives of 
the course and of the different units of work, and to 
organize the subject-matter so that these objectives will 
be attained. All this presupposes thorough study by the
teacher of the instructional materials and should result
in a greater thoroughness of preparation. Gale Smith
says, "The formulation of objective tests compels a
teacher to study, to organize, and to plan her work to
the most minute [detail]." (25) Such an intimate knowledge of
the subject-matter is almost certain to result in improved
teacher presentation.
The advantages of "unit teaching" have been extolled 
by many leading authorities. Gale Smith says, "For 
objective tests to be most effective as a supervisory agency,
the subject-matter being taught must be broken up into
small units.... Any subject naturally divides
itself into units corresponding to the different topics
and sub-topics which it includes. This natural division of
subject-matter should be the foundation for the work." (26)
It is always hard for an inexperienced teacher to plan the 
units of work. As this bears upon the efficiency of 
teaching, it is the supervisor's duty to assist in the 
laying out of these teaching units. Until this is done, no 
testing program worthy of consideration can possibly result. 
In supervisory work, provision should be made for the 
supervision of testing. In some school systems with which
the present writer has been connected, and in some that he
has [visited], there is no supervision of the tests that the

(25) Smith, Gale, How to Construct and Use Non-Standardized
Objective Tests, pp. 114, The Benton Review Shop, 1929.

(26) Ibid, pp. 115.


individual teacher gives. After all, the supervisory
program embraces the supervision of testing as well as
the supervision of teaching. Hildreth calls attention 
to this in the following statement: "Permitting teachers 
to choose and use any tests indiscriminately is indefens- 
ible and may result in great economic waste to the school. 
Supervision of testing is as important as supervision of 
instruction.... Although in the best schools
opportunity is always allowed for the exercise of the
teacher's discretion in such matters, there is always some
supervision to relate the work of each particular teacher
to the activities of the whole school, thus insuring proper
integration and uniformity in the [program]." (27)

VIII. Diagnosis of Special Difficulties 

Diagnostic testing and remedial teaching are the two
pillars of strength that support the whole superstructure
of classroom instruction. Effective teaching requires the
diagnosis of special difficulties in learning. In this
respect, the work of a teacher is much like that of a
physician. The former must know the characteristics of
learning difficulties and their remedies; the latter, the
symptoms of common diseases and their cures. Just as the
skilful practitioner probes his patient's ailment, so too
the resourceful teacher critically scrutinizes his pupil's


(27) Hildreth, Gertrude H., Psychological Service for 
School Problems, pp. 64, World Book Co., 1930.




difficulty in order to determine the reason for its exist-
ence. No competent physician would attempt to cure an
ailment before he was reasonably certain what he was up
against. Similarly, no progressive teacher would prescribe
mental pills for his pupils until he had sized up the
situation. "The first act in the teaching process should
be diagnostic testing, in order that the teacher may know
what to [teach]." (28) This test is called a pre-test by many
educational authorities. Professor Brueckner says, "The
theory back of this procedure is the same as that back of
all diagnosis, namely, the orientation of teaching." (29)
The results from a diagnostic test not only indicate
the progress a class is making but also show which
members of the class are not profiting by the instruction.
This information is of great importance to the teacher. If
the scores of the test are unusually low for all the students,
it is a pretty good index that the instructional methods
being used are not suitable. If the scores indicate, how-
ever, [that only certain pupils are failing,] special study
must be made of the pupils obtaining these low scores in
order to find out the trouble. Gale Smith suggests the
following method of summarizing the results in a
teacher-made objective test. Let us take as an illustration
the scores on a Commercial Geography test administered in 
Stamford High School. 


(28) Carlson, Paul A., The Measurement of Business
Education, pp. 7, South-Western Publishing Co., 1932.

(29) Brueckner, Leo J., and Melby, Ernest O., Diagnostic
and Remedial Teaching, pp. 451, Houghton-Mifflin Co., 1931.


[Diagnostic analysis sheet: Commercial Geography test, Stamford High School]


The summarized results given on the diagnostic sheet
represent the scores obtained by a class of twenty-eight
pupils. Even a cursory examination of the results will 
indicate that the questions represent varying degrees of 
difficulty. For example, questions five, eight, fifteen, 
and twenty-nine were hard for the pupils as shown by the 
large number of failures. This is good evidence that the 
teacher should reteach these particular points in his 
next lesson. 
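
The tally behind such a diagnostic sheet can be sketched as follows. This is a minimal illustration, not Gale Smith's own form: the function name `items_to_reteach`, the four-pupil class, and the rule of flagging any item missed by at least half the class are all assumptions made for the example.

```python
def items_to_reteach(responses, threshold=0.5):
    """Flag test items missed by at least `threshold` of the class.

    responses: one list per pupil, holding True for each item answered
    correctly and False for each item missed.
    Returns a list of (item_number, number_of_failures) pairs.
    """
    n_pupils = len(responses)
    n_items = len(responses[0])
    flagged = []
    for i in range(n_items):
        # Count the pupils who missed item i (the "show of hands").
        failures = sum(1 for pupil in responses if not pupil[i])
        if failures / n_pupils >= threshold:
            flagged.append((i + 1, failures))  # items are numbered from 1
    return flagged


# Hypothetical results for four pupils on a four-item test.
class_responses = [
    [True, False, True, True],
    [True, False, False, True],
    [True, True, False, True],
    [False, False, True, True],
]

print(items_to_reteach(class_responses))
```

With these invented figures the second item is missed by three pupils and the third by two, so both are flagged, while the first and fourth items need no reteaching.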

It is quite probable that some teachers would object 
to the use of the Analysis Sheet that Gale Smith proposes 
on the grounds that it is too time consuming. The prepara- 
tion of such a chart does not take much time, but it might 
take more than can be allowed by some teachers. If so, the 
best procedure is to dispense with it and to obtain the
needed information by other means. Some teachers do the 
work in the following manner. First, they dictate the 
diagnostic test allowing the pupils sufficient time to 
answer each question. Next they direct the pupils to ex-
change papers. The teacher then reads off the correct
answers. After this, the pupil counts up the number of
items correct and indicates the number on the top of the
test; usually, he initials the test that he has just corrected.
When this is done, the teacher calls for a showing of hands




as to the errors on each test item, and the total is written 
on the front board after the number of the item. Now, with 
the information before them, the class are able to attack 
the items that were missed. Professor Carlson says, "Unless 
the results of each test are followed up, they are not 
worth the time and trouble which they [take]." (30)

The tendency toward the use of the Morrison "Mastery 
Formula" seems to be gaining momentum. From the standpoint
of waste elimination, this is a healthy trend. Apropos of 
this, Gale Smith says, "Recent research has tended to 
indicate, beyond doubt, that there has been entirely too 
much needless repetition of teaching. It is not necessary 
to teach again what the pupils already know. Our greatest 
waste in teaching comes, not on account of teaching retarded 
pupils who need it but on account of reteaching those who 
could be accelerated, and who do not need repetition. The
method of testing, diagnosis, reteaching, follow-up work
and retesting which is suggested will conserve time and
reduce waste in teaching." (31)

In addition to the diagnostic work on ordinary tests, 
it is sometimes advisable to give tests that are avowedly 
even more diagnostic in nature. These should be focused
on the different units of work where weaknesses crop up.




(30) Carlson, Paul A., The Measurement of Business 
Education, pp. 7, South-Western Publishing Co., 1932. 


(31) Smith, Gale, How to Construct and Use Non-Standardized 
Objective Tests, pp. 113, The Benton Review Shop, 1929. 




Perhaps a distinction should be made between the
survey test and the diagnostic test at this point. The
first indicates what has not been learned; the second, why
it has not been learned. Odell says, "The tests employed,
whether standardized or home-made, should in so far as
possible not merely show what errors pupils make or what
gaps there are in their knowledge, but why the errors are
made or the gaps [exist]." (32) A diagnostic test will be, of
necessity, much more detailed than the survey test. In
most diagnostic tests several questions or exercises are
inserted that deal with the same facts or processes on the
theory that the pupil's deficiencies will be shown up more
clearly. Odell says, "Another quality of satisfactory
diagnostic tests is that they should frequently contain
several questions or exercises dealing with the same facts
or processes. The purpose of this is that teachers may
know from the results obtained whether or not pupils really
know or do not know the points [covered]." (33)

When the teacher comes in contact with an obviously
mal-adjusted pupil, he must take advantage of his professional
experience in order to ascertain the exact nature of the
trouble. It is right at this point that the inexperienced
teacher falls down. An experienced teacher knows much more


(32) Odell, C. W., Educational Measurement in High School, 
pp. 550, The Century Co., 1930.


(33) Ibid, pp. 550. 




about the physical causes of poor school work than his
less-experienced colleague. Oftentimes a teacher with little
or no professional training is prone to brand a failing
pupil as moronic because the latter does not learn readily.
In reality, the pupil's poor progress might be
due to something entirely remote. Hildreth says, "McCall
has found the chief causes of many of the learning difficulties
of school children to be insufficient practice, improper
methods of work, deficiency in fundamental skills, absence
of interest, physical defects, and subnormal [intelligence]."(34)
If the pupil's learning difficulties are so deep-seated that
they are not discovered by the ordinary diagnostic methods
employed by the teacher, special clinical methods will have
to be used and the aid of the school psychologist enlisted.

Little has been said so far about specific diagnostic
testing methods in the social-business subjects. Little
can be said, really, because of the limited sources of
information available. What has been written about diagnostic
testing applies to all subjects with the same force as
to the social-business group. Diagnostic testing and
remedial teaching are harder with the social subjects than
with other subjects. Brueckner and Melby say, "It is
evident from the foregoing discussion that remedial work


(34) Hildreth, Gertrude H., Psychological Service for 
School Problems, pp. 147, World Book Co., 1930. 



in the social studies becomes a more difficult and less
exact procedure than would be the case in a subject such
as [arithmetic]."(35) After all, the difficult thing in relation
to teaching subjects in the social-study group is that there
is no agreement among authorities as to the facts to be
taught. Brueckner and Melby point out that: "It is
readily apparent that the test-maker faces a more or less
baffling [situation] in this field. In the first place there
is no complete agreement on the facts to be included."(36)
This visits quite a hardship on the teacher, as he must
decide what facts should be included. Here,
again, an immature teacher is at a decided handicap, as he
has not sufficient background to judge between the relative
merits of different teaching materials.

More and more the use of varied teaching materials is
being stressed in the social-business subjects. It is
considered sound teaching procedure now to require the
pupils to consult many sources of information other than
their own textbook. After they get accustomed to this
library research work, the pupils really like it. A great
many pupils develop a real love for books as a result of
this preliminary training. Kimmel says, "The program of
wider reading in the social studies depends largely upon the
degree to which teachers are successful in arousing the


(35) Brueckner, Leo J. and Melby, Ernest O., Diagnostic
and Remedial Teaching, pp. 475, Houghton-Mifflin Co., 1931.


(36) Ibid, pp. 448. 




interest of pupils in reading. ... The cultivation
of a taste for worthwhile books is one of the most important
goals of the efforts of teachers of the social studies."(37)

It is in reference to the reading program that the
teacher in the social-business subjects is thrown upon his
mettle. How to discover and single out the pupils that
employ faulty reading methods is a problem with which the
progressive teacher must cope. Kimmel says, "New-type tests
furnish an essential part of the procedure in the development
of a program of remedial instruction in cases where
pupils have failed to gain the reading [abilities]."(38)
New-type reading tests have proved a wonderful source of help
in aiding the teacher to find out within a short time the
pupils who really need special assistance in the development
of correct reading habits. Johnson says, "The
ability to read effectively is perhaps the most important
single factor in success in those subjects which require
the use of books. A constructive program of supervision
should include a test of silent reading."(39) Failure by the
teacher to discover the pupils who cannot read properly
will many times result in the pupil's complete retardation.
This situation is inexcusable; after all, there is a
teacher responsibility wrapped up in every pupil failure.




(37) Kimmel, William Glenn, The Management of the Reading 
Program in the Social Studies, Publications of the 
National Council for the Social Studies, pp. 24, 
McKinley Publishing Co., October, 1929. 

(38) Ibid, pp. 64. 

(39) Johnson, Franklin W., Administration and Supervision 
of the High School, pp. 389, Ginn & Co., 1925. 




IX. Cultivation of Intellectual Powers

Examinations tend to build up within the individuals
taking them certain intellectual habits that are important
in everyday living. In an examination a pupil is thrown
entirely on his own resources. Such a situation gives
important training in concentration and self-reliance.
The examinee is under the immediate necessity of mustering
all the information he has acquired about a given topic or
question and organizing it in his mind. Then, he must
sift the information to obtain the particular part or parts
that bear upon the question. All of this represents training
that should prove invaluable in the development of the
average pupil. Lang includes the following list of mental
powers: "Application, concentration, persistence,
self-reliance, and [resourcefulness]."(40) It is evident that the
cultivation of these intellectual habits is an important
phase of the school work pertaining to testing.


(40) Lang, Albert R., Modern Methods in Written Examinations,
pp. 42, Houghton-Mifflin Co., 1930.




G. PROCEDURES FOR DRAFTING NEW-TYPE TESTS 


The combination test is given a prominent place in
any treatise on tests and measurements because it is made
up of both recognition and recall tests. Lang says,
"Frequently it is called a battery test, because of the
arrangement of a number of similar testing devices in
groups or sets for producing a united measurement."(1)
It is clear that the combination or battery test is
suitable for the longer and more comprehensive term and
final examinations. A more adequate sampling of the
instructional materials is obtained because of this greater
length, hence the test reliability is increased. The
present writer proposes, in this chapter, to consider first
the method for drawing up a combination test, and secondly,
the principles and rules underlying the preparation of
its different parts, i.e., the true-false section, the
completion section, the matching section, the multiple-choice
section, etc. It must be borne in mind that the
various tests included in the combination test will vary
depending upon the test-maker's judgment as to which test
forms are appropriate. In some cases, the combination test
will consist of true-false, completion, and matching
sections; in others, the multiple-choice or other test forms
will be included. The test forms included will depend on four


(1) Lang, Albert R., Modern Methods in Written Examinations,
pp. 184, Houghton-Mifflin Co., 1930.




factors, viz., the teacher's personal preference, the nature
of the subject-matter, the method by which the test is to
be given, and the main function the test is expected to
perform.

At this point, let us consider briefly the advantages
of the combination test. In the first place, test construction
is made easier by the use of a combination test. All
test-makers have more or less difficulty in moulding the
subject-matter into certain test forms. If a variety of
test forms is to be used, there is a greater possibility
that if one test form does not lend itself readily, another
will. Secondly, the combination test adds rapport to the
examination. If a variety of test forms is used, the
interest of the examinee is sustained. A long examination
consisting entirely of one test type breeds monotony and
results in a lagging of pupil interest. Thirdly, the
combination test gives the mental abilities a wider scope
because of the variety of test forms employed. The results
are more reliable when various types of mental reactions
toward the subject-matter are stimulated and then sampled.
It is apparent from the above reasons that the combination
test fulfills an important function in the well-rounded
testing program.


It is highly important that the test-maker follow some 




general plan in the building of the examination. Just
as a contractor is guided by the architect's plans in the
construction of a house, so too, the test-maker must rely
upon a fundamental plan for the building of his examination.
Professor Ruch suggests an excellent procedure for the
drafting of new-type examinations. The present writer
includes it because of the attention that it attracted.
Ruch suggests ten steps that must be carried out in the
rearing of the new-type test.

I. Drawing Up a Table of Specifications

The Table of Specifications is a general guide or
outline in the building of a test. Such a table insures
that the entire subject-matter will be covered. It prevents,
also, the over-emphasis of minor topics which results in
the improper balance of the sampling. This table is really
a skeleton outline of the instructional materials. Suppose
that a unit in Commercial Geography contains six main
groups of ideas, and that these groups vary in their relative
importance. In the table, each group title would be shown
as a main heading and each would be supported by the minor
thoughts listed below it. Now, the test-maker would be
under the necessity of weighing the importance of each
group and assigning a percentage value to indicate each one's




(2) Ruch, G. M., The Objective or New-Type Examination, 
pp. 149, Scott, Foresman and Co., 1929. 




relative importance. For example, suppose that the weights
given the six groups were 10%, 10%, 15%, 30%, 20%, and 15%
respectively. This would mean that the first group should
contain approximately 10% of the test items; the second,
10%; the third, 15%; etc.
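
The arithmetic of the table can be sketched in a few lines of Python. This is only an illustration: the function name, the key letters A to F, and the rounding rule (largest remainder, so that the counts always add up to the planned total) are assumptions, not part of Ruch's procedure.

```python
def allocate_items(weights, total_items):
    """Split total_items among topic groups in proportion to percentage weights.

    Largest-remainder rounding (an assumed rule) keeps the counts summing
    exactly to total_items.
    """
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100 per cent")
    exact = {k: total_items * w / 100 for k, w in weights.items()}
    counts = {k: int(v) for k, v in exact.items()}
    leftover = total_items - sum(counts.values())
    # Give the leftover items to the groups with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts


# The six weights from the example above, with hypothetical key letters.
weights = {"A": 10, "B": 10, "C": 15, "D": 30, "E": 20, "F": 15}
print(allocate_items(weights, 100))
```

For a test of 100 items the counts simply equal the percentages; for other lengths the rounding rule decides which groups receive the odd item.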

Another important point to observe is that each main 
group in the table should be given a key letter in order 
to identify it. This procedure should make for greater 
order as each test item could be numbered with the key 
letter of the group to which it pertains. 

II. Drafting the Items in Preliminary Form 

The next step consists of drafting the preliminary
test items using the Table of Specifications as a guide.
In completing this step, the percentages included in the
table are not to be followed implicitly. To do so would
visit a real hardship upon the test-maker. It is sufficient
if test items are formulated covering each topic and
sub-topic in turn. Time should not be taken at this point to
produce refined test items.

Ruch says that the important tasks are:

"1. Covering the field thoroughly but at the same
time avoiding trivial points.

2. Deciding which test form is best suited for
handling the particular question in mind."(3)


(3) Ibid, pp. 153.




It is suggested that 3" by 5" library cards be used,
and that the item be double-spaced to allow for correction.
Library cards may well be used as they may be rearranged,
discarded, shuffled, etc., without necessitating any
rewriting of the other items. Each card should contain four
things: the key letter, the test item, the indicated
answer, and a temporary sequential number.

At this point it is wise to remember that more test
items should be written than will probably be used. Ruch
holds that these items should aggregate 25 to 50 per cent
more than the estimate calls for. The extra items insure
two things, viz.:

1. The culling out of poorly worded items is made
possible.

2. A better balance of the emphasis between the main
topics of the test may be obtained.
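
As a rough check on how many preliminary items to write, the 25 to 50 per cent margin can be computed directly. The function name and the rounding up to a whole item are assumptions made for this sketch.

```python
import math


def preliminary_draft_range(items_needed, low=0.25, high=0.50):
    """Return (minimum, maximum) preliminary items: 25 to 50 per cent over the estimate."""
    return (
        math.ceil(items_needed * (1 + low)),
        math.ceil(items_needed * (1 + high)),
    )


print(preliminary_draft_range(200))  # (250, 300)
```

An estimate of two hundred final items therefore calls for between two hundred fifty and three hundred preliminary cards.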


III. Deciding Upon the Length of the Test 

Let us assume in the drafting of the preliminary test
items that they number two hundred fifty. It would be
possible, then, to allow for a shrinkage of fifty items if
only two hundred items were necessary to exhaust the subject.
This situation would be ideal from the standpoint of the
test-maker because the latter would be enabled to plan two
forms of the same test with one hundred items each.


There are a number of reasons why two forms are more 




desirable than a single form. In the first place, one 
form may be administered to the class and the alternate 
kept for absentees. Secondly, the alternate form can 
always be used for re-testing pupils who wish to take a 
test on a unit they have failed. Thirdly, the two forms 
may be used in rotation year after year. If the first form 
is used one year, the alternate may be used the next and 
the danger from old examination papers being handed down by 
last year's pupils will be minimized. Lastly, if a longer 
and more comprehensive test is desired, both forms can be 
administered.

IV. Editing and Selecting the Final Items

At this time the culling out of ambiguous and poorly- 
worded items has to be done. It has been thought best 
to allow a little time, preferably a day or two, to elapse 
after the preliminary drafts of the items. In this interval, 
the test-maker has a chance to mull the material over in 
his mind. It is possible that at the end of this period, 
the test-maker will have a greater understanding of the 
real significance of the individual test items. 

The test-maker should be thoroughly conversant with 
the principles and rules of rhetoric and grammar. It is of 
extreme importance that good sentence structure be used in
the drafting of the individual test items. Poorly-worded




items lower the test validity to a marked degree. Each
item must be gone over with the same degree of care that
an editor uses in conning an important manuscript. It is
a good idea for the teacher to put himself in the pupil's
place, and try purposely to misread the meaning of the
test items. If any item is not perfectly clear, it
should be either discarded or rewritten.

V. Rating the Items for Difficulty

There is an advantage to rating the items in increasing 
order of difficulty. If this were done, the point to 
which a pupil progressed in his long term examination would 
indicate the pupil's degree of mastery in the subject-matter. 
At best, these ratings are not very accurate since they are 
highly subjective estimates. This rating may be done on a 
five-point scale. The rating "one" will be given to items
so easy that all or nearly all of the pupils may be expected
to answer them correctly. The other numbers of the scale
indicate ascending degrees of difficulty. A "five" rating
designates the items which are thought to be so difficult 
that all or nearly all of the pupils will fail. Teacher 
rating of items is made more accurate if a number of 
teachers pool their ratings. 
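
One simple way to pool the ratings is to average them item by item. The sketch below assumes three teachers and made-up ratings; the averaging rule and all names are illustrative only, since the text does not say how the pooling is to be done.

```python
from statistics import mean

# Hypothetical ratings (1 = easiest, 5 = hardest) given by three teachers.
ratings = {
    "item 1": [1, 2, 1],
    "item 2": [4, 5, 4],
    "item 3": [3, 3, 2],
}


def pool_ratings(ratings):
    """Average each item's ratings, rejecting values outside the five-point scale."""
    pooled = {}
    for item, scores in ratings.items():
        if any(score not in (1, 2, 3, 4, 5) for score in scores):
            raise ValueError(f"{item}: ratings must lie on the 1-5 scale")
        pooled[item] = round(mean(scores), 2)
    return pooled


pooled = pool_ratings(ratings)
for item, value in pooled.items():
    print(item, value)
```

The pooled figure is still a subjective estimate; it merely damps the idiosyncrasies of any single rater.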

VI. Breaking the Items Into Equivalent Forms 


It is at this point in the building process that the




examination begins to take form. As a result of this step,
the test-maker will have two roughly equivalent forms of
the examination. Up to now, it has been discovered that
certain items are better than others for test purposes.
Certain items have been found to be too easy, too difficult,
or vague in meaning. These unsatisfactory items must now
be eliminated until the numbers shown by the Table of
Specifications are approximated.

The next step deals with the sorting of the test items 
between the two test forms. This can be accomplished by 
taking the cards and dealing them into two piles exactly 
as playing cards would be dealt. It is intended here to 
equalize the forms through the law of chance. The net 
result will be that each form will have one-half of the 
test items pertaining to each of the six main topics 
included in the Table of Specifications. 
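
The dealing of the cards can be imitated as follows. The deck of two hundred cards, the key letters, and both function names are hypothetical. The first function follows the text and trusts to chance, so each form receives only approximately half of each topic; the second is a variant, not described in the text, that deals each key-letter group separately so that the halves are exact.

```python
import random
from collections import Counter


def deal_into_forms(cards, seed=None):
    """Shuffle the item cards, then deal them alternately into two piles."""
    rng = random.Random(seed)
    deck = list(cards)
    rng.shuffle(deck)
    return deck[0::2], deck[1::2]


def deal_by_topic(cards, seed=None):
    """Variant: deal each key-letter group separately so every topic splits evenly."""
    rng = random.Random(seed)
    groups = {}
    for card in cards:
        groups.setdefault(card[0], []).append(card)
    form_1, form_2 = [], []
    for group in groups.values():
        rng.shuffle(group)
        form_1.extend(group[0::2])
        form_2.extend(group[1::2])
    return form_1, form_2


# Hypothetical deck: 200 cards, each a (key letter, item number) pair.
sizes = {"A": 20, "B": 20, "C": 30, "D": 60, "E": 40, "F": 30}
cards = [(letter, n) for letter, count in sizes.items() for n in range(1, count + 1)]

form_1, form_2 = deal_into_forms(cards, seed=1935)
print(len(form_1), len(form_2))  # 100 100
print(Counter(letter for letter, _ in form_1))
```

Printing the tally by key letter shows how closely the chance deal approaches an even split for a given shuffle.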

VII. Rearranging the Items in Order of Difficulty 

If the test items have already been rated in difficulty, 
it is a simple matter to rearrange them after the elimination 
of faulty items has been effected. 
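
With the ratings recorded on each card, the rearrangement amounts to a sort. The three cards below are invented for illustration.

```python
# Hypothetical cards surviving the editing step:
# (key letter, item text, difficulty rating on the five-point scale).
form = [
    ("D", "Name the chief export of Brazil.", 3),
    ("A", "Wheat is a cereal. (True or false)", 1),
    ("C", "Explain why Duluth ships iron ore.", 5),
]

# Sort from easiest to hardest; Python's sort is stable, so cards with
# equal ratings keep the order in which they were dealt.
ordered = sorted(form, key=lambda card: card[2])

for number, (letter, text, rating) in enumerate(ordered, start=1):
    print(number, letter, rating, text)
```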

VIII. Preparing the Instructions for the Test 

No two authors on tests and measurements agree in
their instructions for objective tests. All concede the
necessity for clarity, fullness, and brevity in the test




instructions. In describing the instructions, the use
of the adjectives "full" and "brief" seems paradoxical.
The instructions should be sufficiently detailed so that
they are readily understandable, and yet brief enough so
that there is no excess verbiage. The need for very
complete instructions will vary depending upon the ages
and mentality of the groups in question. After the pupils
have been subjected to repeated contacts with objective
tests, they will become "test wise" and brief test
instructions will be adequate.

In writing the instructions, it is wise to frame them
so as to meet the level of the lowest mentalities in the
group. The simplest synonyms for all words should be used.
A very good way to help the pupils orient themselves to a
test is to include fore-exercises with each section. The
examination is a measure of achievement, not of ability to
follow directions. It has been the writer's experience that
good pupils sometimes fail because they do not interpret
the test instructions correctly. This situation is especially
noticeable in cases where the objective test involves a type
of test new to the pupils. For example, consider the Carlson
bookkeeping test dealing with the working sheet. In this
test, the arithmetic has been "dehydrated". In other words,
the pupils are required to indicate the extensions of the




five classes of bookkeeping items by means of check marks.
The writer in his teaching experience has never found a
secondary school class that has been able to interpret
this type of test and solve it correctly, notwithstanding
the fact that the pupils had previously received a thorough
training in the use of the working sheet.

The pupils should be informed whether to hurry or to
work slowly and carefully. If tests are timed, this should
be made known in advance. Another important point concerns
the answering of doubtful and unknown items. Authorities
differ as to the advisability of instructing pupils to guess.
Dr. Ben D. Wood uses instructions against pure guessing.
Dr. W. A. McCall contends that the more guessing there is,
the more adequate the statistical correction for guessing.
A study of the evidence reveals that the weight of opinion
is against guessing because of bad habit formation.
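
The text does not state the correction itself. The form in common use subtracts from the number right the number wrong divided by one less than the number of choices, which for true-false items reduces to "rights minus wrongs". The sketch below assumes that formula; the function name and the sample figures are illustrative.

```python
def corrected_score(right, wrong, choices):
    """Apply the common correction for guessing: R - W / (n - 1).

    Omitted items are neither right nor wrong and do not enter the formula.
    """
    if choices < 2:
        raise ValueError("an item needs at least two choices")
    return right - wrong / (choices - 1)


print(corrected_score(70, 20, 2))  # true-false items: 50.0
print(corrected_score(70, 20, 5))  # five-choice items: 65.0
```

The penalty shrinks as the number of choices grows, since a blind guess is then less likely to succeed.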

IX. Marking the Answer Keys or Stencils 

The nature of the answer key or stencil will depend on
two factors, i.e., the nature of the test and the number to
be scored. It is patent that an elaborate scoring device
will not be necessary if merely the papers of an ordinary
class are to be scored. Furthermore, the function of the
test must be taken into consideration. A test purporting
to serve as a short check-up test will not necessitate the




labor involved in the making of an indestructible scoring
stencil. To spend time in the planning of such a stencil
would represent an unwise expenditure of the busy teacher's
time. It is probable that the best system to follow under
such circumstances would be to score one of the mimeographed
test papers and lay this alongside the individual test
papers as they are scored. Ruch issues the reminder that:
"It is ordinarily unwise to use any plan of scoring which
calls for actual reading of the test items and the pupils'
[responses]."(4)

The two principal types of devices for indicating
responses are as follows:

1. Aligned response columns, usually vertical in
position, e.g.,

The Mesabi Range is located in the State of ..................
The most important city in southern California is ............

2. Staggered response blanks, e.g.,

One of the principal products of China is corn, gold, wheat,
tea, ironwood.
Ohio is bounded on the west by Missouri, Indiana, Iowa,
Illinois, Michigan.

Even a casual study of the above test items proves
the economy of time in the use of aligned responses. They
are always possible with simple recall, matching, and
true-false tests. They are often possible with multiple-choice




(4) Ruch, G. M., The Objective or New-Type Examination, 
pp. 184, Scott, Foresman and Co., 1929. 




tests when the method of response is by number, rather 
than by underlining. 

Professor Ruch classifies answer keys and stencils
as follows:

"1. Strip keys for aligned vertical columns of
response blanks or response words in such tests
as:
   a Simple recall
   b Numbered multiple-choice
   c Matching
   d True-false (especially the + -, the + 0, or
     the writing of T and F, etc.)

2. Transparent celluloid or tissue-paper stencils
for such tests as:
   a Unnumbered staggered multiple-response
   b True-false, yes - no, same - opposite, etc.,
     when underlined

3. Cut-out stencils for such tests as:
   a Staggered (ordinary) completion
   b Staggered computation

4. Answer sheets for reference."(5)

The strip key is a very common form of stencil. It
consists of a strip of heavy cardboard from one-half to
one inch wide and the length of the test page. The correct
answers are written or printed on the key in such a way that
when the key is placed in juxtaposition with the test paper
the answers and the test items are parallel. If the key has
been prepared properly, the short eye span between the answer




(5) Ibid, pp. 176.




and the item should make for speed in correction.


Transparent stencils may be made either of tissue 
paper or of celluloid sheets. If made from the former, 
they have only a limited period of usefulness and then 
will have to be replaced. A celluloid stencil should last 
a much longer period of time; in fact, for many years. A 
celluloid stencil marked with launderer's ink and dipped 
in white shellac is almost indestructible. In marking, 
this type of stencil is superimposed upon the actual test 
page and if the marking on the stencil fails to coincide 
with the marking on the test paper, the item is incorrect. 

The cut-out stencils serve much the same purposes as 
the transparent stencils. They are troublesome to prepare, 
although less expensive than the transparent type. Like 
the latter, they render their main service in reference to 
the correction of tests arranged in staggered response 
blanks. To make a cut-out stencil, take a sheet of thin 
cardboard, a piece of carbon paper, and a mimeographed test
paper. Superimpose the test paper on the cardboard with 
the carbon paper in between. Draw rectangles around each 
answer blank large enough to include the pupil's answer. 
Now take the cardboard sheet and cut out the rectangles 
that have been traced upon it. Below the opening of each 


rectangle, write the correct answer to that particular 




block. It can readily be seen that when this stencil is 


superimposed upon the individual test papers, it will be 
an easy matter to check errors. 

The use of an answer sheet in scoring papers has already 
been explained. It consists merely of a mimeographed test 
with the correct answers written in. This is the only type 
of key used by many teachers. 

X. Deciding Upon Rules for Scoring 

There are a few general truths that should be emphasized 
at this time about scoring. Most experts hold that partial 
credits should not be given. Questions should be worded in 
such a way that they can be marked either completely right 
or completely wrong. In completion tests one point credit
should be given for each blank which is correctly filled.
Matching tests give one point credit for each pair properly
matched. Some teachers make the mistake of attempting to
weight test items for difficulty or relative importance.
This is not considered a commendable practice as it is based
on mere teacher opinion which is a highly subjective factor.
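
The scoring rules above can be illustrated with a minimal sketch. It assumes that each response is compared with the key as a whole and earns either one point or nothing. The function name, the sample key, and the normalization of case and surrounding spaces are illustrative assumptions, not part of the procedure described in the text.

```python
def score_paper(key: list[str], responses: list[str]) -> int:
    """Give one point for each response that matches the key; no partial credit."""
    if len(key) != len(responses):
        raise ValueError("the paper must have one response slot per key entry")
    # Assumption: differences in case and surrounding spaces are ignored.
    return sum(
        1
        for expected, given in zip(key, responses)
        if given.strip().lower() == expected.strip().lower()
    )


# Hypothetical five-item paper; an omitted item is an empty string.
key = ["T", "F", "T", "T", "F"]
paper = ["T", "T", "t ", "", "F"]
print(score_paper(key, paper))  # 3
```

Every item carries the same weight, in line with the advice against weighting items on the strength of teacher opinion.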

Any section upon rules for scoring would not be com- 
plete if certain controversial issues involving correction 
for chance effects were not mentioned at this time. In two 
response tests (including true-false, yes - no, same - 


opposite, etc.) and in multiple-choice tests, the scores 




should be corrected for chance. In two-response tests,
including the true-false type, the correction formula is:

\[ S = R - W \]

S = the corrected score
R = the number right
W = the number wrong

Sometimes this same formula is expressed in the following
way: score = number attempted - two times the number wrong.
By using either formula, the same corrected score should be
obtained in reference to any specific test paper. In
correcting multiple-choice tests for chance effects the
generally accepted procedure is:

\[ S = R - \frac{W}{n - 1} \]

S = the corrected score
R = the number right
W = the number wrong
n = the number of possible responses presented to the pupil.
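
As a worked illustration of the two formulas, the following sketch computes corrected scores for a hypothetical paper. The function names and the sample figures are assumptions made for the example. Exact fractions are used so that a division by n - 1 is not rounded.

```python
from fractions import Fraction


def corrected_score(right: int, wrong: int, options: int = 2) -> Fraction:
    """Score corrected for chance: S = R - W / (n - 1)."""
    if options < 2:
        raise ValueError("a recognition item needs at least two possible responses")
    return Fraction(right) - Fraction(wrong, options - 1)


def corrected_two_response(attempted: int, wrong: int) -> int:
    """Alternate two-response form: number attempted minus twice the number wrong."""
    return attempted - 2 * wrong


# Hypothetical true-false paper of 100 items: 70 right, 20 wrong, 10 omitted.
print(corrected_score(70, 20))             # 50
print(corrected_two_response(90, 20))      # 50
# The same counts on a four-response multiple-choice test.
print(corrected_score(70, 20, options=4))  # 190/3
```

The first two results agree because the number attempted equals R + W, so attempted - 2W reduces to R - W. With four options the deduction falls to one third of the number wrong, which shows numerically why the guessing factor matters less as the number of responses increases.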


It is reasonable to state that the guessing factor becomes
of decreased importance with the increase in the number of
possible responses. Lang says, "It has been found, however,
that if the number of the options are four or more the
guessing factor becomes sufficiently reduced to make the
correction for guessing unnecessary."(5) Studying opinions
from the leading authorities, it may be concluded that
correction for chance need not be made in the case of the
multiple-choice test if the number of suggested responses


(5) Lang, Albert R., Modern Methods in Written Examina- 
tions, pp. 133, Houghton-Mifflin Co., 1930. 




number four or five.


Methods for Drafting Parts of the Combination Test 
So far we have considered in this chapter the combination
test. As we have previously seen, this type of test
consists of both recognition and recall items. Recognition
tests consist of three main types, i.e., true-false,
multiple-choice, and matching. Recall tests include two
well-known types, i.e., single-answer and completion. It
has been held that chance effects are a much more serious
problem in the case of recognition tests than with recall
tests. In reference to the completion test, Lang says,
"Item for item, it is probably the most reliable of all the
new-type examinations."(6) A good combination test will
include both recognition and recall items.
I. The True-False Test

True-false testing appears to have given the original
impetus to the new-type testing movement; hence, when
objective testing is mentioned, some teachers think of
true-false tests only. The ability to discriminate between
truth and falsity is a valid measure of the pupil's
accomplishment in the subject-matter covered by the test. It is
unfortunate that so much emphasis is placed upon this single
type of test to the exclusion of other more-valid types.


(6) Ibid, pp. 107. 




Some authorities feel that the true-false test does not
merit such emphasis when other more-valid test forms are
available. Lang concludes that: "Its extensive but not
exclusive use is ..."(7)
If the disadvantages of the true-false test are borne 
in mind, this test is a valuable aid in assisting the 
teacher to arrive at safe conclusions as to the accomplish- 
ments of entire classes or individual pupils. No single 
type of test is efficacious in bringing out the exact degree 
of progress that pupils are making. The status of a pupil 
or an entire class is arrived at by an averaging of all the 
results obtained from the well-rounded testing program. It 
is no mere accident, however, that the true-false test has 
been used to such an extent by the ordinary classroom teacher.
The following advantages are cited in its defense: 
1. It is the simplest and most adaptable of all the
new-test items because of the ease with which it
can be prepared and scored.

2. It stimulates desirable mental processes and
attitudes.

3. It can be given satisfactorily by the dictation
method if duplicating facilities are not available.

4. It ranks high in degree of rapport from the
standpoint of pupils taking the test.

5. It is possible to cover a wide range of
subject-matter in a brief time.


(7) Ibid, pp. 103. 




The two criticisms usually leveled against this test
are the following:
1. It offers a golden opportunity on the part of 
unprepared pupils for guessing the correct 
responses. 


2. The false statements that it contains give the 
wrong impressions. 


Let us consider these two criticisms in order to find out 
how valid they are. In considering the guessing factor, 
some authorities differentiate between "pure" and "brilliant" 
guessing. The former consists of making a chance decision 
when there is no basis of knowledge; in the latter, there 
is a definite basis of fact. While the former is to be 
deprecated, it is generally agreed that the latter is not 
undesirable as it parallels a life situation. Countless 
situations come up where decisions have to be made on the 
basis of the knowledge already acquired and the person who 
has the most adequate store of facts is the one who can 
guess best. The guessing factor does not cause serious 
concern if the test scores are corrected for chance effects. 
Lang says, "If the true-false test is sufficiently long and
properly prepared, and if the score is corrected for guessing,
the guessing factor is not serious."(8)

The second criticism really arises because of a failure 


to distinguish between testing and teaching. Persons who 


(8) Ibid, pp. 111. 




raise this objection point to the negative-suggestion
effects of true-false questions. They maintain that the
false statements in the true-false test tend to leave the
pupils with false impressions as to the subject-matter.
Now, the crux of the matter is this: if the test follows
the teaching instruction, the pupils already should be
conversant with the subject-matter from all angles. Right
impressions have been formed if the teaching has been
effective, and the test is merely a measure as to how well
the learning has been done.

As to the construction of the true-false test, any
test-maker will find it profitable to examine C. C.
Weidemann's monograph entitled "How to Construct the True-False
Examination."(9) This monograph is one of the most comprehensive
studies that have been completed within the last decade and
should yield a wealth of information. The construction of
a good true-false test cannot be accomplished by taking a
number of statements from the textbook at random and turning
half of them into false statements. Ambiguities and partly-true,
partly-false items can be avoided only if special care
is taken in the planning and actual drafting of this type
of test.

Carlson suggests the following rules in true-false 
test preparation: 
(9) Weidemann, C. C., How to Construct the True-False
Examination, Teachers College, Columbia University,
1926. 118 pages.

"1. Prepare a long list of true statements covering
the subject-matter to be tested. Then proceed
to change all of these sentences into good true
or false items.

2. Make approximately half of the items true.

3. Each test should contain at least 100 items in
order to yield reliable and discriminating scores.

4. Each true-false item should be as short as possible.
The standard length is from 10 to 20 words.

5. Avoid the use of negatives especially double
negatives.

6. Avoid the use of dependent or modifying clauses.
Most of such statements may be separated into two
good true-false items.

7. Avoid the use of items which are partly true and
partly false.

8. Watch "specific determiners" such as "all",
"always", "never", and "degree or comparison
statements".

9. Duplicate, mimeograph, or print the test.

10. Make adequate provision for indicating the
responses."(10)
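
Several of these rules are mechanical enough to be checked before a test is duplicated. The sketch below is only an illustration. The function names, the sample statements, and the tolerance of ten percentage points around "approximately half" are assumptions, and the list of specific determiners contains only the three words named in rule 8.

```python
import re

# Only the words named in rule 8; a working list would likely be longer.
SPECIFIC_DETERMINERS = {"all", "always", "never"}


def review_item(statement: str) -> list[str]:
    """Return notes on one statement: length (rule 4) and specific determiners (rule 8)."""
    words = re.findall(r"[A-Za-z']+", statement.lower())
    notes = []
    if not 10 <= len(words) <= 20:
        notes.append(f"length is {len(words)} words (standard is 10 to 20)")
    found = sorted(SPECIFIC_DETERMINERS.intersection(words))
    if found:
        notes.append("specific determiner: " + ", ".join(found))
    return notes


def review_test(items: list[tuple[str, bool]]) -> list[str]:
    """Return notes on the whole test: item count (rule 3) and share of true items (rule 2)."""
    notes = []
    if len(items) < 100:
        notes.append(f"only {len(items)} items (rule 3 asks for at least 100)")
    if items:
        true_share = sum(1 for _, is_true in items if is_true) / len(items)
        # Assumed tolerance: within ten percentage points of one half.
        if abs(true_share - 0.5) > 0.1:
            notes.append(f"{true_share:.0%} of the items are true (rule 2 asks for about half)")
    return notes


# Hypothetical statements for illustration.
print(review_item("A promissory note is always negotiable."))
print(review_item("A check drawn on a bank is payable on demand "
                  "to the person named as payee"))
```

The first statement draws two notes, one for its length of six words and one for the word "always". The second statement, at sixteen words, draws none. Rules 5 to 7 concern meaning rather than form and still call for the test-maker's own judgment.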


II. The Multiple-Choice Test 
This type of test is intended to measure kind and
quality of reasoning. A true statement is made, and after
it are given a number of choices as to why it is true. The
examinee is required to study the possible responses and to
select the one that represents the best answer. Great care
must be taken in the preparation of this test to insure




(10) Carlson, Paul A., The Measurement of Business 
Education, pp. 18, South-Western Publishing Co., 1932. 


/ ar 


that the possible responses, or "confusions", are of equal
plausibility. The guessing factor may be minimized if the
following precautions are observed:

1. The confusions in each test item should be listed
in chance order.

2. The correct answer and the confusions should be
equally attractive and familiar.

3. Provision should be made for correcting the test
scores for chance effects.
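
The first precaution, listing the responses in chance order, can be sketched as follows. The function name, the dictionary layout, and the sample item are assumptions made for the illustration. The sketch also assumes that the correct answer differs from every confusion.

```python
import random


def build_item(stem: str, correct: str, confusions: list[str],
               rng: random.Random) -> dict:
    """Place the correct answer among the confusions in chance order."""
    options = [correct, *confusions]
    rng.shuffle(options)  # chance order, so that position gives no clue
    return {
        "stem": stem,
        "options": options,
        "key": options.index(correct) + 1,  # numbered response for a strip key
    }


# Hypothetical commercial-law item with three confusions.
rng = random.Random()  # pass a seed, e.g. random.Random(1935), to reproduce a form
item = build_item(
    "A contract made by a minor is voidable because",
    "the law protects minors from their own inexperience",
    [
        "a minor cannot sign his name legally",
        "all contracts must be made in writing",
        "only merchants are permitted to make contracts",
    ],
    rng,
)
print(item["stem"])
for number, option in enumerate(item["options"], start=1):
    print(f"  {number}. {option}")
print("key:", item["key"])
```

Recording the key number when the item is built keeps the answer key in step with the shuffled order. Whether the confusions are equally attractive, the second precaution, remains a matter of judgment that no shuffling can supply.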


From the standpoint of a practical commercial teacher, 
the one serious defect in the use of this test is that with 
some types of subject-matter it is difficult, almost imposs- 
ible, to get four satisfactory responses. 

III. The Matching Test 

Matching tests may be used to measure either factual
memory or judgment. In this test two sets of items or
expressions are given and the examinee is required to match
an item in one set with its correct answer in the other.
The weight of authority is toward limiting the size of the
test to between ten and twenty pairs of items. It is desirable
to have the list of items in at least one of the two
columns arranged in alphabetical order. If this is done,
the pupil is more readily able to find the items.

Care should be taken in the planning of this test to
make one of the columns longer than the other in order to
prevent perfect matching. This can be accomplished by




adding three or four plausible items to the second column. 


Odell says, "Matching tests fit almost all subjects,
although not all portions thereof; and where they are
appropriate, constitute one of the few best types to use."(11)
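
The layout advice for the matching test (ten to twenty pairs, one column in alphabetical order, and a few extra items in the second column) can be sketched as below. The function name and the placeholder data are assumptions for illustration only. Treating the size limits as hard errors is likewise a choice made for the sketch.

```python
def build_matching_test(pairs: dict[str, str], extras: list[str]):
    """Return the first column, an alphabetized second column, and the key."""
    if not 10 <= len(pairs) <= 20:
        raise ValueError("the usual limit is between ten and twenty pairs")
    if not extras:
        raise ValueError("add plausible extra items so the columns differ in length")
    first_column = list(pairs)
    # The second column holds every correct answer plus the extras, alphabetized.
    second_column = sorted(set(pairs.values()) | set(extras), key=str.lower)
    key = {term: second_column.index(answer) + 1 for term, answer in pairs.items()}
    return first_column, second_column, key


# Placeholder data: ten hypothetical terms and three extra responses.
pairs = {f"term {i}": f"definition {i:02d}" for i in range(10)}
extras = ["definition 97", "definition 98", "definition 99"]
first, second, key = build_matching_test(pairs, extras)
print(len(first), len(second))  # 10 13
```

Because the second column is longer than the first, a pupil who knows all but one pair cannot settle the last one by elimination.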
IV. The Completion Test 

Completion tests are included in the category of recall
tests. One of the main advantages connected with them is
that they are very free from guessing and chance effects.
In the ordinary type of completion tests, blanks may be left
almost any place in the sentence. Carlson says, "Modern
usage favors the single blank space at the end or toward
the end of each incomplete sentence."(12) Great care
must be taken in the drafting of the test items in order to
insure that key words are the ones omitted. As has been
previously emphasized, the completion test rates highest in
regard to reliability among the new-type tests that have
been devised.


(11) Odell, C. W., Educational Measurement in High School,
pp. 492, The Century Co., 1930.


(12) Carlson, Paul A., The Measurement of Business 
Education, pp. 20, South-Western Publishing Co., 1932. 




H. SUMMARY OF EXISTING PUBLISHED TESTS AND EXAMINATIONS
IN THE SOCIAL-BUSINESS STUDIES

Any writer dealing with the social-business subjects
soon gets the impression that here is a fertile field for
study and research. This impression becomes especially
pointed when he comes to the phases dealing with tests and
measurements. There is a crying need for competent workers
in the test field. More and more, tests are being published
that deal with social-business studies; even as yet,
however, the published test materials are inadequate. Dr.
Tonne says, "The printed new-type test is apparently the
least used form of test in the social-business subjects.
The reason for this is obvious. As far as can be ascertained
there are no printed new-type tests available in banking,
history of commerce, advertising, salesmanship, business
organization, and business ..."(1) If the test-makers do
not supply published tests in these subjects, the conclusion
is inescapable that the ordinary teacher must avail himself
of the new-type test techniques and prepare his own tests.

Some of the best test materials that have been published 
deal only with the subject-matter of the particular book 
that they are supposed to accompany. This type of test is 


termed "a published text-book test". The bad feature about


(1) Tonne, Herbert A., and Tonne, M. Henriette, Social 
Business Education in the Secondary Schools, pp. 28, 
New York University Book Store, 1932. 




such a test is that its content does not necessarily cover
the entire field of subject-matter if it is based on merely
one text-book. A partial answer to this objection is that
most recognized text-books in the social-business studies
do parallel pretty well the present knowledge about the
subject-matter with which they deal. One of the chief
deterrents in working out tests in the social-business
studies is that educational experts do not agree as to what
materials should be included in a specific subject.

The test material that has been published in the
social-business studies has been favorably received and widely
acclaimed. The gloomy picture painted in the above
paragraphs should not be a source of consternation to the
progressive teacher. Even if we must grant that there is
a dearth of published tests, each year brings forward new
suggestions and teacher-built tests that merit more than
local importance. At this time, a summarization will be
made of the published tests.

I. Bookkeeping 

There is a question as to whether bookkeeping should 
be included in the social-business group. The answer to 
this depends upon the aims and objectives of the specific 
course. If the teaching emphasis is upon bookkeeping as


the study of business rather than bookkeeping merely as the 




training of bookkeepers, the subject comes under the fold
of the social-business studies. The present writer favors
the teaching of bookkeeping as an introduction to business
and, consequently, would include bookkeeping with the
social-business studies.

The most important published bookkeeping tests in the
field of secondary education follow:


1. Bookkeeping and Accounting Tests, Paul A. Carlson,
South-Western Publishing Co., Cincinnati, Ohio.
Series A - 9 unit tests for the 15th edition of
the "20th Century Bookkeeping and Accounting".
Series C - 6 unit tests for McKinsey's "Bookkeeping
and Accounting".
Series D - 12 unit tests for the 16th edition of
the "20th Century Bookkeeping and Accounting".
These tests are furnished without charge in any
quantity needed to schools using the books with
which they correlate. Series A, C, and D are one
cent a test for other schools. These tests comprise
some of the best-known published tests in
the entire field of Commercial Education. The
author states that his tests have a high degree
of validity but he has not indicated why he is
justified in making this statement. Printed
norms are supplied with all the tests. The
published coefficients of reliability for Series
D are the following: Test 1, .895; Test 2, .937;
Test 3, .939; Test 4, .918; Test 5, .896; and
Test 6, .880. Professor Carlson is considered
one of the pioneers in this field.

2. Bookkeeping and Achievement Tests, Charles E.
Bowman, American Book Co., New York, New York.
Six unit tests based upon Bowman and Percy's
"Principles of Bookkeeping and Business (Elementary
Course)". Cost: One form or set, consisting
of 15 copies of same test unit, 20 cents;
6 sets, covering the 6 test units and assembled
in one package, 96 cents. Manual and key, 12 cents.
Achievement tests covering the "Advanced Course"
have been prepared by Professor Atlee L. Percy
and published by the American Book Co. Six
unit tests. While these tests have been
carefully prepared, there is no available
information as to their validity and reliability.

3. Bookkeeping Tests, J. Hugh Jackson, Thomas H.
Sanders, and Alexander H. Sproul, Ginn and Co.,
Boston, Massachusetts.
Eight series of tests (4 tests in each series).
Series I-IV for first-year course; Series V-VIII
for second-year course. To be used with Jackson,
Sanders, and Sproul's "Bookkeeping and Business
Knowledge". Cost: Per series, full package (30
copies of each of 6 tests), $2.40; per series,
half package (15 copies of each of 6 tests), $1.20.
These tests have been carefully prepared, yet
there is no available information as to their
validity and reliability.

4. Bookkeeping Tests, Fayette H. Elwell and James
B. Toner, Ginn and Co., Boston, Massachusetts.
Eight series of tests (4 tests in each series),
to be used in connection with Elwell and Toner's
"Bookkeeping and Accounting". Cost: Per series,
full package (30 copies of each of 4 tests), $1.60.
Per series, half package (15 copies of each of
4 tests), $.80. No information relative to the
validity and reliability of these tests is
available.

5. Elwell-Fowlkes Bookkeeping Test, F. H. Elwell and
J. C. Fowlkes, World Book Co., Yonkers, New York.
Two parts, one for the end of the first semester
and one for the end of the second semester. Two
forms. Cost: 25 for $1.30, with manual of
directions, key and class record; specimen set,
$.35. There is no available information as to the
validity and reliability of these tests.

6. Bookkeeping Tests, Fayette H. Elwell, Ginn and Co.,
Boston, Massachusetts.
Seven series of tests (4 tests in each series),
to be used in connection with "Bookkeeping for
Today". No information relative to the validity
and reliability of these tests is available.

7. M. B. P. Objective Tests (Series A), Nathaniel
Atholtz and Louis Broverman, Lyons and Carnahan,
Chicago, Illinois.
Six parts covering important units of subject-matter
in Atholtz and Klein's "Modern Bookkeeping
Practice" (First Year Course). Cost: They may
be secured without cost other than postage for
shipping where the Modern Bookkeeping Practice
Test is used in class work. There is no available
information as to the validity and reliability of
these tests.

8. Rational Objective Tests in Bookkeeping and
Accounting (Series A), Clyde Insley Blanchard,
Gregg Publishing Co., New York, New York.
Ten unit tests on the bookkeeping usually taught
the first year. Cost: 5 of any one test for
10 cents. Teacher's edition, including one set
of 10 tests, manual of instructions, and one set
of keys, 25 cents net. There is no available
information about the validity and reliability
of these tests.

9. Rational Objective Tests in Bookkeeping and
Accounting (Series B), D. T. Deal, Gregg
Publishing Co., New York, New York.
Twelve unit tests, each on one of the chapters
of Belding and Greene's "Rational Bookkeeping
and Accounting". Cost: 5 of any one test for
10 cents. Teacher's edition, including one set
of twelve tests, answers, and manual of instructions,
25 cents net. There is no available
information in regard to the validity and reliability
of these tests.


II. Commercial Law 


While it is true that the field of bookkeeping has
been covered fairly well by published test materials, the
other social-business subjects do not enjoy a similar
advantage. The subject of commercial law has been covered
by published test materials much more fully than some of the
other social-business subjects. Certain of the published
objective tests are considered excellent from the standpoint




of test construction methods, viz., "Commercial Law
Achievement Tests" by Peters, Greiner, and Green. The
following published tests are considered the leaders in
this field:


Commercial Law Achievement Tests, P. B. S. Peters, 
Lloyd &. Greiner, and Fred H. Green, South-Western 
Publishing Co., Cincinnati, Ohio. 

Ten unit tests based on Peters and Pomeroy's 
"Commercial Law". Cost: Set of 10 tests, 24 
cents. These tests have gone through many thorough 
steps of validation. It is safe to say that they 
are among the best published tests in Commercial 
Law at the present time. The published reliability 
coefficients are: .895, .874, .893, .888, .918,
.854, .822, .877, .876, and .850.


2. Case Problems and Tests in Business Law, Frederick
K. Bentel and Carmen G. Ridiker, Ginn and Co.,
Boston, Massachusetts, 108 pages.

Book consists of 11 sections based on major divisions
of subjects. Each section is subdivided,
providing five types of exercises: true-false,
selection, and completion tests, which measure the
degree to which the student has grasped the principles
of law; and case problems and business judgment
tests, which measure his ability to apply the
principles to business. Correlates with Huffcut's
"Elements of Business Law". Cost: 52 cents. No
information in regard to the validity and reliability
of these tests is available.


3. Questions and Cases in Business Law, Clyde O.
Thompson, American Book Co., New York, New York. 
Sixty-nine test units based upon the usual subject 
divisions of commercial law. Cost: One set, 
bound in pad form, 40 cents. Manual and key, 36 
cents. No information relative to the validity 
and reliability of these tests is available. 


4. New Burgess' Commercial Law Diagnostic Tests, J.
H. Cox, Lyons and Carnahan, Chicago, Illinois,
110 pages.

This book is a group of objective tests designed
to diagnose and test the needs of commercial
law students. Each is a timed test. Grading keys
are available. Cost: 30 cents net. No
information relative to the validity and reliability
of these tests is available.


5. Rational Objective Tests in Commercial Law, Gregg
Publishing Co., New York, New York. 
Four final tests for use with any text. Cost: 5 
of any one test for 10 cents; Specimen set, including
the four tests, teacher's key, and
instructions, 10 cents. There is no available 
information about the validity and reliability of 
these tests. 


6. Commercial Law, Harlow Publishing Co., Oklahoma
City, Oklahoma. 
Test I. Law in General, Property and Contracts. 
Test II. Negotiable Instruments, Guaranty and 
Suretyship, Sales, Personal Property and Bailment. 
Test III. Agency, Partnership, Corporations, 
Insurance and Real Property. 
One form. Cost: Single tests, 10 cents; 25 tests
and key, 75 cents; 100 tests and two keys, $2.50.
No information in regard to the validity and 
reliability of these tests is available. 


III. Junior Business Training 
A number of good published tests are available in this
subject. Most of the leading textbooks now have objective
tests that accompany them. The important published tests 
are the following: 


1. General Business Training Achievement Tests, H. 
H. Crabbe and Clay D. Slinker, South-Western 
Publishing Co., Cincinnati, Ohio. 

Eight unit tests based on Crabbe and Slinker's
"General Business Training". Sold in units of
four tests each, one unit including Tests 1-4, 
and the other unit Tests 5-8. Cost per unit, 
4 cents. While these tests have been prepared 
with considerable care, there is no available 
information relative to their validity and 
reliability. 




2. General Business Training Tests, J. Raymond Smith,
South-Western Publishing Co., Cincinnati, Ohio.
Eight unit tests based on Crabbe and Slinker's 
"General Business Training”. Two forms. Cost: 
16 cents per set. No information relative to 
the validity and reliability of these tests is 
available. 


3. Junior Business Training Achievement Tests,
Frederick G. Nichols, American Book Co., New
York, New York.

Eight test units based upon Nichols' "New Junior
Business Training" (Part 1). Cost: One form or
set, consisting of 15 copies of the same test
unit, 12 cents; 8 sets, covering the 8 test units
and assembled in one package, 88 cents. While
these tests are considered very important in the
subject of Junior Business Training, there is no
available information relative to their validity
and reliability.


4. Objective Tests in Business Science (Series A),
Lloyd L. Jones, Gregg Publishing Co., New York,
New York.

Twenty-seven unit tests, 4 semi-final and 2 final
tests, based on the contents of "General Business
Science" and the accompanying student's workbooks,
"Projects in Business Science". Cost: Unit
Tests, 5 of any one test, 5 cents; Semi-final
Tests, 5 of any one test, 10 cents; Final Tests,
5 of any one test, 10 cents; Specimen set, including
one copy of each test (33 in all), one set of
instructions, and one set of keys, 50 cents. No
information relative to the validity and reliability
of these tests is available.


5. Objective Tests in Elements of Business Training,
John M. Brewer, Floyd Hurlbut, and Juvenilia 
Caseman, Ginn and Co., Boston, Massachusetts. 

Four tests, Series I-IV, for revised edition of 
“Elements of Business Training". Cost: 25 copies 
of any one series, 48 cents. While these tests 
have been prepared carefully, no information exists 
relative to their validity and reliability. 


6. Zu Tavern's Business Training, Cass, a textbook
test, Commercial Text Book Co., South Pasadena, 
California. 

No information relative to the validity and 
reliability of these tests is available. 




IV. Commercial Geography 


The field of commercial or economic geography still 
offers a rich opportunity for test-makers. Several tests 
of a very excellent grade have been published; especially 
noteworthy are the tests by Dr. H. O. Lathrop. There is no
single set of tests that covers all phases of the subject, so
far as the present writer has been able to ascertain. Dr.
Lathrop's tests accompany the Ray H. Whitbeck textbook 
"Industrial Geography". An examination reveals that they 
are based mainly on the first section of the textbook which 
deals with the geography of the United States. 

A summary of published tests includes the following: 


1. Tests in Lathrop's Laboratory Manual in Industrial
Geography, H. O. Lathrop, American Book Co.
Sixteen unit tests, divided into "A" and "B"
levels, based on "Industrial Geography" by Ray
Hughes Whitbeck. The best published textbook
test that exists in Commercial Geography today.
There is no available information relative to the 
validity and reliability of these tests. 


2. Tyrrell's Geography Exercises, James F. Tyrrell,
The Palmer Co., Boston, Massachusetts.
Fifteen completion tests ranging in length from
40 to 60 items each on the geography of the world. 
More suitable for junior high than senior high 
use. Cost: Complete specimen set, 20 cents; in 
quantities for class use, 1 cent per test. No 
information about the validity and reliability of 
these tests is obtainable. 


3. Industrial and Commercial Geography Tests, John
W. Morris, Harlow Publishing Co., Oklahoma City, 
Oklahoma. 

Three tests. Numbers one and two contain three 
parts each; number three, five parts. 

Test I. United States: Location, Production,
Industry and Commerce.

Test II. Latin America, Europe, Asia, Africa, 
and Australia. 

Test III. Map Locations 

This test series is especially adaptable to 
high school use. It was prepared under the 
auspices of the Oklahoma Council of Geography 
Teachers, 106 pages. There is no available 
information about the validity and reliability 
of these tests. 


4. Geography Trade Problems and Practice Tests, 
Lylyan H. Block, A. J. Nystrom and Co., Chicago, 
Illinois. 

Tests based on Nystrom International Trade Desk
Maps; 18 test series, one for each of the impor- 
tant products that figure in international trade. 
Well-planned series intended primarily for high 
school use. There is no available information
relative to the validity and reliability of
these tests.


5. Witham's Standard Commercial Geography Tests, 
J. L. Hammett Co., Cambridge, Massachusetts.
A series test intended for high school use. 
There is no available information in regard to 
the validity and reliability of these tests. 
V. Economics
The paucity of good published test material in the subject
of economics is especially lamentable. Professor Inglis
states that the subject was taught in Massachusetts as far
back as 1821. (2) Considering the long period of time over which
this subject has been taught, it is amazing that greater
strides have not been taken in the development of published
objective tests. The one ray of light in a gray sky is
afforded by the American Council Economics Test, details of
which are given below.


1. American Council Economics Test, Horace Taylor,
T. N. Barrows, and Ben D. Wood, World Book Co.,
Yonkers, New York.
Two forms. Cost: Package of 25 of either form,
with manual of directions, key and class record,
$1.30 net. Specimen set, 20 cents. Tentative
norms have been established for this test. No
information is available relative to the validity
and reliability of these tests.

--------------------
(2) Inglis, A., Principles of Secondary Education, Houghton-
Mifflin Co., Boston, 1918, pp. 187.

VI. Business English, Business Organization, Salesmanship 
and Advertising, History of Commerce, and Banking 
So far as the present writer has been able to discover, 
there are no published tests in the above-listed subjects. 
This is due in some cases, at least in part, to the nature 
of the subject; in others, to the unimportant position that 
the subject holds in reference to other subjects on the 
curriculum. Certain published tests do cover part of the
subject-matter of one or another of the above subjects. Consider,
for example, the subject of business English. There
are two published tests that are of value in this subject.
The first of these is Leslie Clark's "Letter Writing Test",
and the second, D. D. Lessenberry's "Tests on the Parts of
the Business Letter", published by the L. C. Smith and Corona
Typewriters Company, Inc. In reference to the subject of
business organization, H. G. Shields' "An Experimental
Business Backgrounds Test" has much to offer the progressive
teacher in the way of suggestions for helping him prepare
his own informal objective tests in the subject.




I. METHODS OF CHANGING TEST SCORES INTO GRADES
I. The Marking System 

There is a tacit agreement existing among most teachers 
that school marks and a definite marking system are a good 
thing. Most members of the profession accept the marking 
system of the individual school in which they are employed
without demurring in the least. Of late years, however, it
has been brought out quite forcefully that certain crudities
do exist in the individual marking plans, and that some
marking systems are distinctly superior to others. Exponents
of "progressive education" hold that marks are not necessarily
a good thing; in fact, they would favor a complete
recasting and reorganization of the marking systems in common
usage. Some well-known schools that have been organized as
progressive schools have even gone to the point of eliminating
the five-point scale of marks (A, B, C, D, and X) and of
substituting merely two grades, viz., pass or fail.

At the outset, the present writer wishes to emphasize 
that he does not favor the abolition of marks, but, rather, 
favors a definite marking system adhered to by all the 
members of the particular school. It is his contention
that school marks have decided motivating values that
cannot be ignored. While it is recognized that an extrinsic
motive such as pupils working for marks is not
necessarily a good thing if carried to extremes, yet


the mere knowledge that a scale of marks is to be used 
will have the effect of stimulating many pupils to do 
their utmost. Odell, in common with many other authorities,
holds that school marks should be [retained]. (1) It
must be remembered by teachers that school marks are not
at all times a complete motivation for a new unit. No
effective teacher deviates very far from the principle 
of trying to motivate his subject by developing an intrinsic 
interest in the subject-matter itself. The motivating
value is only one of the purposes of marking. Crooks
says, "A brief yet inclusive summary might mention pupil
and parental information, guidance, classification and
certification, motivation, and measurement of educational
efficiency as the primary aims of all marking [systems]." (2)
No marking system can be entirely adequate until it 
has been carefully defined. A particular marking system 
in use in a given school should be thoroughly understood 
not only by the teaching staff but also by the pupils and 
their parents. In many cases, the teacher assumes that 
parents understand the marking system when, in reality, 
they are entirely oblivious of even its fundamental 


elements. This situation compels the teacher to familiarize
himself with the school marking system so that he can
explain the logic of it to parents who make inquiry. Ruch
says, "After all, any marking system is arbitrary. Without
definition it can have no [meaning]." (3) If members of the
teaching staff are in doubt as to certain aspects of the
marking system, it is compulsory that they consult the
school executives in order to clear up the situation. If
a new marking system is being planned, it is always good
policy to allow the teaching staff to participate in the
actual work. The teachers would then have the opportunity
to challenge anything that seemed illogical. Ruch says,
"Almost any scheme of recording marks, provided it be
adhered to by all teachers in the same school or school
system, and provided further that it be understood by all
concerned, will prove adequate if there is a valid and
reliable provision for the measurement of the relative
abilities of the pupils to be [marked]." (4)

--------------------
(1) Odell, C. W., Educational Measurement in High School,
pp. 459, The Century Co., 1930.
(2) Crooks, A. Duryea, Marks and the Marking System: A
Digest, The Journal of Education Research, Dec., 1935.

II. Absolute versus Relative Marking Systems 

It is undoubtedly true that there are many hundreds 
of different marking systems in use in the United States 
if we include all the minor variations of the common 
types. All marking systems reduce fundamentally to two 
types, viz., systems based upon absolute standards, and 
those based upon relative values. Odell concludes that
the former are much more common. (5)

--------------------
(3) Ruch, G. M., The Objective or New-Type Examination,
pp. 376, Scott, Foresman and Co., 1929.
(4) Ibid, pp. 377.
(5) Odell, C. W., High School Marking Systems, School
Review, Vol. XXXIII, 1925, pp. 346.


The first marking system consists of the percentage
grading plan. This plan presupposes the grading of papers
on a scale ranging from 0 to 100, in all, 101 levels of
pupil accomplishment. The chief advantage accruing from
the use of such a plan is that teachers are familiar with
it. Ruch points out that: "The greatest weakness of this
system is that its sheer familiarity leads to an uncritical
acceptance without conscious regard to the inherent
assumptions." (6) An uncritical acceptance of the percentage
system implies one of two things:


1. That the examiner thinks he can distinguish
between 101 levels of pupil accomplishment; or


2. That, owing either to ignorance or to inertia, he does
not care whether his marks are misleading because
they indicate an apparent accuracy which really
does not exist.

Absolute systems of marking assume absolute standards 
of judgment. They presume to attempt the precision of 
physical measurements with their very definite units such 
as hours, pounds, etc. Marking systems have not been worked 
out to the extent that fine distinctions may be drawn
between the relative accomplishments of two pupils who for
all practical purposes have attained the same degree of
mastery in a specific subject. For example, suppose that
the percentage system is utilized in a given school and
that one pupil is graded 86 and another 85. It appears on
the face of things that the first pupil is superior to
the second. The one basic criticism of the above situation
is, according to Ruch: "A large body of experimental
evidence points to the fact that from but five to seven
levels of ability are ordinarily recognizable by teachers
in marking pupils. . . . The difference
between an 85 and an 86 is a difference at least five times
as fine as the human judgment can ordinarily [distinguish]." (7)

--------------------
(6) Ruch, G. M., The Objective or New-Type Examination,
pp. 370, Scott, Foresman and Co., 1929.

Another potent criticism leveled against the percentage
system is in reference to failures. If the percentage
system is in use, there is one percentage point, known as
the "passing" mark, which discriminates between passing and
failing. If the passing mark is 70%, a pupil receiving a
grade of 69% does not pass and, consequently, must take the
course over again. It is no defense to say that the line
must be drawn somewhere. Some educators refuse to pass
pupils unless they have mastered the subject-matter completely.
Carlson states that: "It is absurd to defend a fixed passing
mark which permits some poor pupils to carry on and forces
others who are but infinitesimally poorer to repeat the
entire course from the [beginning]." (8)

--------------------
(7) Ibid, pp. 373.
(8) Carlson, Paul A., The Measurement of Business
Education, pp. 25, South-Western Publishing Co., 1932.

The second general type of marking system employs the
idea of the normal curve of distribution. Professor Max
Meyer of the University of Missouri was the originator
of this method. According to his plan, pupils were to 
be marked on a basis of relative, not absolute, achievement.
Chance or probability is the underlying logic.
It has been noted that chance phenomena tend to form a
symmetrical bell-shaped curve. In biology, it is known
as the biological curve. In statistics it is known as 
the normal frequency curve. When applied to marking 
systems, it is known as the "Missouri Plan". 

The normal distribution curve is a useful guide in 
the changing of test scores into grades. Most human 
traits, including mental ability and school achievement, 
seem to approximate the normal distribution curve. There 
is an accumulation of cases about average ability. This 
accumulation clusters around the middle part of the curve. 
From this point the curve slopes off toward the upper and 
lower extremities, indicating a diminishing number of 
persons as the distance from the average increases. If 
the test grades of a large number of pupils are taken, 
the approximate distribution shown by the normal distribution
curve should be expected.

Based upon the normal curve, it is possible to compute
the number of pupils who will fall into a particular grade
group. Mathematicians have computed, in reference to a
five-point or quintile system of grading, that 6% of the
pupils should be expected to receive A, 25% should receive
B, 38% should receive C, 25% should receive D, and 6%
should receive X. In a six-point or sextile system, the 
proportions would be 3% A's, 16% B's, 31% C's, 31% D's, 
16% E's and 3% F's. In practice, it will be found that 
there are many variations of the proportions that have 
been stated. 
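
As a rough check on where percentages of this kind come from, the short Python sketch below computes the share of a normal distribution that falls into each of five bands. The cut points at 0.5 and 1.5 standard deviations on either side of the mean are an assumption made for illustration only (the text does not say how its 6-25-38-25-6 figures were obtained), and the function names are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def band_shares(cuts):
    """Percentage of cases between successive cut points (in SD units)."""
    # Cumulative areas at each cut, padded with 0 and 1 for the two tails.
    cdf = [0.0] + [normal_cdf(c) for c in sorted(cuts)] + [1.0]
    return [100.0 * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]


# Assumed cut points: 0.5 and 1.5 standard deviations each side of the mean.
shares = band_shares([-1.5, -0.5, 0.5, 1.5])
print([round(s) for s in shares])  # prints [7, 24, 38, 24, 7]
```

Under these assumed cut points the shares come out close to, though not identical with, the 6-25-38-25-6 proportions quoted above, which is consistent with the remark that many variations of the proportions are found in practice.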
III. Methods for Changing Test Scores into Grades

A common fallacy in written examinations is the 
failure to differentiate between a test score and a grade. 
Some teachers think that the terms "score" and "grade"
are synonymous. It is highly important that this error be
avoided. The score is the number of points a pupil makes
upon his examination. For example, in a new-type examination
consisting of one hundred fifty items, the pupil
might get ninety-eight. His score would be ninety-eight.
The pupils are not particularly interested in their scores.
They ask, "To what grade is ninety-eight equal?" Lang
explains that: "The grade is the interpretation of the
score according to some standard or [norm]." (9) The grade
is intended to show the pupil's relative achievement in
respect to that of the rest of the group.

--------------------
(9) Lang, Albert R., Modern Methods in Written Examinations,
pp. 246, Houghton-Mifflin Co., 1930.


a. Proportional Method 

Many teachers are using this method for the conver- 
sion of test scores into grades. According to this plan, 
the test papers are arranged according to their scores 
from the highest to the lowest. The next step consists 
of computing the number of papers that will fall into each 
grade group. If a five-point system of marking is followed 
in the individual school system, the proportions of 6-25-38-25-6
would be used. The best 6% of the papers would be marked A,
the next grade group consisting of 25% of the papers would 
merit B, the middle 38% would get C, the next grade group 
of 25% would receive D, and the remaining papers would be 
graded X or failure. 
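
The steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the rounding of each cut-off to the nearest whole paper, and the treatment of tied scores (left in their original order) are choices made here, not part of the method as the text gives it.

```python
def proportional_grades(scores, shares=(6, 25, 38, 25, 6), labels="ABCDX"):
    """Rank the papers from highest to lowest score and grade by fixed shares."""
    n = len(scores)
    # Cumulative cut-off positions in the ranked list (6%, 31%, 69%, 94%, 100%),
    # each rounded to a whole number of papers (an assumption of this sketch).
    cutoffs = []
    running = 0
    for share in shares:
        running += share
        cutoffs.append(round(n * running / 100))
    # Paper indices, best score first; tied scores keep their original order.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    grades = [None] * n
    rank = 0
    for label, cutoff in zip(labels, cutoffs):
        while rank < cutoff:
            grades[order[rank]] = label
            rank += 1
    return grades


# Hypothetical class of 100 papers with scores 100 down to 1.
grades = proportional_grades(list(range(100, 0, -1)))
print([grades.count(g) for g in "ABCDX"])  # prints [6, 25, 38, 25, 6]
```

With a hypothetical class of only twenty papers the same shares yield 1, 5, 8, 5, and 1 papers in the five groups, so a single paper decides who receives A and who receives X.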

There are a few warning notes that should be sounded 
in regard to this system of marking. First, the normal 
curve is only a guide; consequently, the teacher will have
to make some subjective decisions in the conversion process. 
Suppose that a pupil is a border-line case and that 
according to his test score he might be placed in either 
the B or C groupings. The teacher must review in his own 
mind the other work of this pupil in order to decide to 
which grouping the pupil rightfully belongs. As this is 
a subjective judgment, there is no surety that an absolutely
fair decision will be reached. Secondly, the application
of the proportional method to the scores of a particular
class does not insure perfectly fair grading. Consider
the situation when a class consists of only fifteen or
twenty pupils. If a teacher follows the normal curve
of distribution mechanically, it is quite possible that
such a procedure will be unfair to the poorer members of
the class. After all, the proportional method works best 
where large numbers of scores are being converted into 
grades. 

b. Sigma Method 

This method is one of the best procedures for changing 
test scores into grades. Lang says, "This method should be
mastered by all who wish to perfect themselves in the
examination [technique]." (10) At first, the statistical work
involved will cause trouble to teachers not grounded in
statistics, but gradually, as they learn the procedure, they
will be impressed with its reasonableness.

--------------------
(10) Ibid, pp. 254.

The sigma method involves the use of the standard
deviation in educational statistics. The standard deviation,
abbreviated S D or σ, is usually explained in reference to
the mean. The mean is the point or value around which the
scores tend to group themselves, whereas the standard deviation
represents the distance from the mean that scores tend
to distribute themselves. In working out problems on the
sigma method it is considered easier to calculate the
standard deviation from a guessed mean by using the formula:


sD 4/ Sum of a _ (sm of ug 
N ae 


At this time it would be appropriate to consider 
examples showing the computation of sigma. An explanation 
of the two common methods for figuring the S D will be 
given in the following pages. 

In the following example, it is assumed that the scores 
are for a class of average size, viz., 35 pupils. The test 
from which these scores came contained a total of 46 items. 
The spread in the scores, the difference between the highest 
and the lowest, is 38 points. The greatest concentration 
in the scores is around 25; consequently, this point will be 
taken as the guessed mean. The (X) column represents the 
scores ranging from the highest to the lowest; the (f) 
column, the number of times each score occurs. The amounts 
in the (d) column are computed by subtracting the guessed 
mean from the score and indicating whether the difference 
is positive or negative. Amounts in the (fd) column are 
obtained by multiplying the amounts in the two preceding 
columns together. Amounts in the (fd²) column are found by 
multiplying the amounts in the (f) column by the amounts in 
the (d) column squared. The remainder of the problem consists 


of making the proper substitutions in the appropriate 
formulas and then working them out. 
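
The column work described above can be sketched in Python. This is a minimal illustration, not the thesis's own worked table: the function name `mean_and_sd_from_guessed_mean` is hypothetical, the frequency table is made up, and the mean is taken as the guessed mean plus the sum of fd divided by N, which is the usual companion to the guessed-mean formula for the standard deviation.

```python
import math


def mean_and_sd_from_guessed_mean(frequencies, guessed_mean):
    """Compute the mean and standard deviation by the guessed-mean method.

    frequencies: dict mapping each score X to its frequency f.
    guessed_mean: a value chosen near the greatest concentration of scores.
    """
    n = sum(frequencies.values())      # N, the number of scores
    sum_fd = 0
    sum_fd2 = 0
    for score, f in frequencies.items():
        d = score - guessed_mean       # deviation from the guessed mean
        sum_fd += f * d                # entry in the (fd) column
        sum_fd2 += f * d * d           # entry in the (fd squared) column
    correction = sum_fd / n
    mean = guessed_mean + correction   # M = GM + (sum of fd) / N
    # SD = sqrt((sum of fd squared) / N - ((sum of fd) / N) squared)
    sd = math.sqrt(sum_fd2 / n - correction ** 2)
    return mean, sd


# Made-up frequency table (not the thesis data): score -> number of pupils.
table = {30: 2, 27: 3, 25: 5, 22: 3, 18: 2}
mean, sd = mean_and_sd_from_guessed_mean(table, guessed_mean=25)
```

Any guessed mean gives the same final mean and standard deviation; a guess near the centre of the scores merely keeps the figures in the columns small.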


Once the mean and standard deviation have been 
determined, it is comparatively easy to change the scores 
into grades. The mean of 24 is taken to be the middle of 
the C group. The standard deviation of 9 is taken to be 
the number of consecutive scores to be included in each 
grade group. 

The C group will consist of the score 24 with scores 
above and below. As there are 9 scores in each grade group, 
the C group will consist of 20, 21, 22, 23, 24, 25, 26, 27, 
28. The B group will include the next nine scores above 
the C group, viz., 29, 30, 31, 32, 33, 34, 35, 36, 37; the 
A group, the scores just above, viz., 38, 39, 40, 41, 42, 
43, 44, 45, 46; the D group, the scores just below the C 
group, viz., 19, 18, 17, 16, 15, 14, 13, 12, 11; and the 
(X) or failing group, the scores from 10 down to the lowest score. 
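
The banding just described can be expressed as a short Python sketch. The function name `sigma_grade` is hypothetical, and treating every score above the A band as A and every score below the D band as X is an assumption made here for completeness.

```python
import math


def sigma_grade(score, mean, sd):
    """Convert a score to a letter grade using bands one standard deviation wide.

    The C band is centred on the mean; B and A lie above it, D and X below.
    """
    # Band 0 is the C group; each band spans `sd` consecutive scores.
    band = math.floor((score - (mean - sd / 2)) / sd)
    if band >= 2:
        return "A"
    if band <= -2:
        return "X"
    return {1: "B", 0: "C", -1: "D"}[band]


# Values from the worked example: mean 24, standard deviation 9.
grades = {s: sigma_grade(s, mean=24, sd=9) for s in range(8, 47)}
```

With a mean of 24 and a standard deviation of 9, the sketch reproduces the groups listed above: 20 to 28 fall in C, 29 to 37 in B, 38 to 46 in A, 11 to 19 in D, and 10 or less in X.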

Even pupils entirely unversed in statistics recognize 
the reasonableness of this method. The greatest criticism 
that the present writer has ever heard directed against it 
by teachers in service is that it takes too much time. 
Many teachers prefer the proportional method to the sigma 
method for just this reason. 

We are now ready to consider the second problem on the 
computation of the S D. The method of using ungrouped scores 
has already been explained; now we shall take up the method 
involving grouped measures. 

[Table: Example Illustrating the Computation of the Standard Deviation, Ungrouped Scores] 

[Table: Example Illustrating the Computation of the Standard Deviation, Grouped Scores] 



The explanation for this second problem is very much 
like that for the preceding one. The main difference 
between the problems is that in the first the scores are 
not grouped, but in the second they are. Grouping of scores 
must be resorted to when there are many scores to handle, 
and when the range is very great between the highest and 
lowest scores. In this problem, the range is 90 points. 
It will be noted, too, that the frequency includes 100 scores, 
whereas the preceding problem merely contained 35. 

In grouping, it is necessary to plan for between 12 and 
20 groups. (11) It is considered best, also, to have each group 
consist of an odd number of scores so that the middle score 
can be taken as the midpoint of the group. In the problem 
under consideration, the range is 90 points. If this is 
divided by 7, the quotient is 12 with 6 as a remainder. 
The scores could very well be divided into groups of seven 
each, as such a grouping will satisfy the requirement that 
Professor Lang suggests. After the grouping, the amounts 
for the various columns are determined. Once the totals 
are obtained, it is easy to substitute in the necessary 
formulas and so arrive at the values of (M) and the (SD). 
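
The choice of group size can be checked with a small Python sketch. The function name `odd_interval_sizes` is hypothetical, and counting a remainder as one extra group is an assumption about how the leftover scores are handled.

```python
def odd_interval_sizes(score_range, min_groups=12, max_groups=20):
    """List odd group sizes that split `score_range` into an acceptable number of groups."""
    suitable = []
    for size in range(1, score_range + 1, 2):          # odd sizes only
        quotient, remainder = divmod(score_range, size)
        groups = quotient + (1 if remainder else 0)    # a remainder needs one more group
        if min_groups <= groups <= max_groups:
            suitable.append((size, groups))
    return suitable


# Range of 90 points, as in the grouped-scores problem.
choices = odd_interval_sizes(90)
```

For a range of 90 points the sketch finds two workable odd sizes, 5 (18 groups) and 7 (13 groups); the second is the grouping adopted in the problem.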

(11) Ibid, pp. 261. 

c. The Morrison Marking System 

Professor Morrison of the University of Chicago has 
long been dissatisfied with existing marking systems. The 
system he suggests is based on the idea of "pupil mastery". 
He deplores the practice of passing pupils if they have 
evidenced a 70 per cent score on their written examinations. 
He contends that teachers are not justified in passing 
pupils who have not completely mastered the subject-matter. 

His evaluation of the rank-in-class (relative) methods 
of marking does not reflect favorably upon them. Professor 
Morrison says, "Appraisal by rank-in-class is therefore 
badly calculated to identify and measure the real educational 
product. Worse than that, it seems to have an essentially 
anti-educational tendency ... in the place of inward satisfaction 
in growth attained, of which the individual can be certain, it 
substitutes the restless ambition to surpass one's [illegible]." (12) 

While the above viewpoint is logical if Professor 
Morrison's premises are accepted, the present writer prefers 
to accept the viewpoint of Ruch, Odell, Symonds, Lang, 
Carlson, and others. 

In the Morrison system of marking, the grade scale of 
A, B, C, D, and X would be eliminated. He suggests that an 
entirely different type of mark be given. His marks would 
not consist of merely one letter or symbol, but would 


include additional information. He says, "If we desire 
further a performance record, we may enter M110 N and 
agree that the expression means that we have evidence of 
mastery, plus a skill rated at 110 on a standardized test, 
plus evidence of cultural [illegible]." (13) 

(12) Morrison, Henry C., The Practice of Teaching in the 
Secondary School, the University of Chicago Press, 
2nd Edition, 1930, pp. 72. 

d. The Percentile Ranking System 

Much merit is claimed for this system by its advocates. 
Carlson calls it "the newest system of [illegible]." (14) According 
to this method, the pupils are first listed in rank order 
and then these ranks are translated into percentiles. For 
example, suppose we have a group of 12,000 pupils taking a 
state-wide bookkeeping examination in Connecticut. The 
one per cent top papers, or 120 papers, would be given a 
rank of 99. The next 120 papers would be given a 98, and 
so on down. If a pupil received a percentile rank of 75, 
he would know that he excels 74% of the pupils and that he 
is excelled by about one-fourth of them. 
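
The translation from rank order to percentile rank can be sketched in Python. The function name `percentile_rank` is hypothetical, and the rule that each successive one per cent of the papers lowers the rank by one, starting from 99, is an assumption generalized from the two cases given in the example.

```python
def percentile_rank(position, total):
    """Percentile rank for the paper at 1-based `position` from the top of `total` papers."""
    if not 1 <= position <= total:
        raise ValueError("position must lie between 1 and total")
    # Each successive one per cent of the papers drops the rank by one, starting at 99.
    return 99 - ((position - 1) * 100) // total


# The state-wide example: 12,000 papers, so each one per cent is 120 papers.
top_paper = percentile_rank(1, 12000)
last_of_top_group = percentile_rank(120, 12000)
first_of_next_group = percentile_rank(121, 12000)
```

Under this rule the first 120 papers receive a rank of 99 and the next 120 a rank of 98, in agreement with the example.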


(13) Ibid, pp. 80. 


(14) Carlson, Paul A., The Measurement of Business 
Education, pp. 25, South-Western Publishing Co., 1932. 




J. THE PROBLEM OF ABILITY GROUPING 

One of the most difficult problems a teacher faces 
is the teaching of a class composed of pupils of varying 
mental abilities. In schools where there is no proper 
classification of pupils, it is no uncommon experience 
for a teacher to be required to handle a class consisting 
of both dull and bright children thrown in together. The 
problem might be further aggravated by the inclusion of a 
few disciplinary or problem children, and some moral or 
emotional misfits. When the teacher is confronted with 
such a hodge-podge, he must use individual instruction 
methods to a large degree. If the pupils in this class 
could be separated into different classes so that each 
would contain approximately the same kind of children, 
the teaching problem would be simplified. Ability grouping 
looks to the classification of pupils with approximately 
the same mental ability into the same class. Cubberley 
says, "What every school principal desires to give to every 
teacher is as homogeneous a working group of pupils as can 
be [illegible]." (1) It is almost an impossibility to get perfect 
homogeneous sectioning of pupils in the ordinary high school 
because of administrative difficulties. 


Individual instruction methods have proved a boon in 
the teaching of ill-assorted groups. Ability grouping 
or homogeneous sectioning really aims at the same thing 
but on a much larger scale. Brueckner and Melby say, 
"Ability grouping is really a mass production method for 
adapting instruction to individual [illegible]." (2) The 
problem of how to reach effectively the dull children and 
yet not bore the others is a vital problem in many classrooms. 
Alberty and Thayer suggest that: "Various plans 
such as ability grouping, individualized instruction, and 
the coaching of laggards, all have their enthusiastic 
advocates, but in spite of much experimentation designed 
to demonstrate the effectiveness of particular procedures, 
we are forced to admit that the problem of how to adapt 
the schools to the individual differences of pupils is 
still a vital [illegible]." (3) 

(1) Cubberley, Ellwood P., An Introduction to the Study 
of Education, Houghton-Mifflin Co., 1926, pp. 251. 

Homogeneous sectioning is intended to render more 
effective the instruction of bright children as well as 
of dull children. In reference to the dull groups, however, 
it has been highly useful. Dr. Baker calls attention to 
the following main defects in the general program of 
education: 

"1. The lack of proper psychological methods of 
    instruction for dull pupils and for bright 
    pupils, too, so far as that goes. 

 2. The expectation of achievement on the part of 
    the dull equal to that of average pupils of 
    the same age. 

 3. The inadequate provision of courses of study 
    which are definitely adapted to the needs of 
    dull pupils or of bright pupils. 

 4. The lack of proper segregation of pupils so 
    that the first three features listed above 
    may be carried out in an efficient manner." (4) 

No one of the above defects is irremediable if the proper 
approach towards its solution is observed. 

(2) Brueckner, Leo J. and Melby, Ernest O., Diagnostic and 
Remedial Teaching, Houghton-Mifflin Co., 1931, pp. 27. 

(3) Alberty, H. R. and Thayer, V. T., Supervision in the 
Secondary School, D. C. Heath and Co., 1931, pp. 291. 

Now the question arises: "How may pupil classification 
be effected so as to yield the maximum benefit?" A study 
of the evidence shows that there is no one method - that 
the number of methods advanced varies directly with the 
number of writers in the field. All evidence points, however, 
to some general principles. Hildreth warns that: 
"In progressive schools pupil classification is never 
haphazard nor [illegible]." (5) Classification, in other 
words, must be based on a carefully evolved plan. Another 
important caution that must be heeded is that the segregation 
of pupils into homogeneous sections must not be based 
solely on one criterion. Alberty and Thayer emphasize the 
following: "It would appear that there is no royal road 
to homogeneous sectioning. The only safe plan is to consider, 
not a single measure, but all the evidence which is 
available, and to provide fully for constant shifting 
from one section to another upon the discovery of apparent 
[illegible]." (6) It is not intended that the above should 
excuse even a busy school staff from adopting measures to 
bring about adequate pupil classification. An effective 
classification of pupils will go far toward eliminating the 
tremendous educational waste represented by pupil failure 
and repetition of courses. 

(4) Baker, Harry J., Characteristic Differences in Bright 
and Dull Pupils, Public School Publishing Co., 1927, pp. 30. 

(5) Hildreth, Gertrude H., Psychological Service for School 
Problems, World Book Co., 1930, pp. 183. 

Some schools still group pupils for instructional 
purposes according to chronological age. From the viewpoint 
of many authorities, such a practice is indefensible. In 
many schools, there is no clear, well-formulated plan underlying 
pupil classification. Van Wagenen says, "More and 
more the principle of classification has given way to a 
trial and error process of grade retention, promotion, and 
acceleration, without any well defined principle to replace 
the one in the process of being discarded." (7) The present 
writer contends that a carefully developed plan of pupil 
classification should be adopted by the school faculty and 
then faithfully observed. 

One of the best lists of criteria for classification 
has been prepared by Professor Hildreth. It must be 
remembered, however, that she is referring mainly to the 
elementary school. She says, "The possible bases for 
pupil classification are as numerous as the characteristics 
of the children to be classified. ... Progressive 
educators consider the following criteria the most 
important for grade placement, grade sectioning, and 
promotion, although opinion varies concerning the weight 
to be accorded any one factor: 

 1. The pupil's probable rate of mental development 
 2. Level of mental maturity 
 3. Predicted progress in one or more of the tool 
    subjects 
 4. Level of achievement reached in any one or more 
    of these skills 
 5. Chronological age 
 6. Social and emotional maturity 
 7. Physiological [illegible]" (8) 

(6) Alberty, H. R. and Thayer, V. T., Supervision in the 
Secondary School, D. C. Heath and Co., 1931, pp. 291. 

(7) Van Wagenen, M. J., Educational Diagnosis and the 
Measurement of School Achievement, pp. 119, The 
Macmillan Co., 1926. 


Symonds suggests a list of bases for homogeneous 
sectioning that would apply more directly to the high school. 
He says, "There are four bases possible for homogeneous 
sectioning: 

 1. Present status in the subject 
 2. Present general ability 
 3. Predicted status in the subject 
 4. Rate of learning." (9) 

In sectioning on the basis of present status, the scores 
on an achievement test could be used. For the second, 
scores on either one or two intelligence tests could be 
taken. For the third, a prognosis test will have to be 
used. For the fourth, the rate or speed with which progress 
will be made in the subject, a comparison of the scores on 
successive achievement tests will have to be studied. 

(8) Hildreth, Gertrude H., Psychological Service for School 
Problems, World Book Co., 1930, pp. 185. 

(9) Symonds, Percival M., Measurement in Secondary Education, 
pp. 485, The Macmillan Co., 1930. 

Let us consider at this point the advantages claimed 
for the practice of ability grouping. Probably the most 
potent argument that can be adduced is that homogeneous 
grouping makes for more efficient pupil learning. Odell 
says, "Most of the arguments in favor of homogeneous grouping 
may be united into one, that it makes for more efficient 
learning and, as a result, better achievement, on the part 
of the pupils. In most cases the basis for this argument 
is theoretical and not [illegible]." (10) A very important 
factor is that it makes the work of the teacher easier. If 
the teacher has pupils of approximately the same abilities 
in his class, he can plan his instruction to meet the level 
of the group. With a heterogeneous class, the teacher must 
prepare to meet the levels of the different elements, and 
his effort is diffused to such an extent that the presentation 
is ineffective. 

(10) Odell, C. W., Educational Measurement in High School, 
pp. 502, The Century Co., 1930. 



Homogeneous sectioning has been attacked from a 
number of angles. Bagley has been one of its strongest 
critics, and Terman has been one of its most staunch 
defenders. Bagley's theory of educational determinism has 
been frequently cited. Symonds calls attention to it in 
these words: "Segregation on the basis of ability means 
differential educational treatment and the very act of 
placing a pupil in one or another [illegible] for a 
fatalism or determinism in his education." (11) This theory 
holds that the placing of a pupil in a "slow" section 
brands him as a dullard and forever dooms him to the 
limited opportunities existing in his group. The force 
of the above indictment is diminished if the shifting of 
pupils from one group to another when they show improvement 
is provided for. 

Great care must be taken in sectioning to avoid the 
use of such terms as "dull group". The pupils in the 
different sections should not be made to realize that they 
have been put in the slow group because of their low 
attainment. The trouble is, however, that pupils in the 
secondary school will soon come to sense what has happened 
even if they do not understand the full details. It is 
difficult, nay impossible, in most schools to make an 
administrative change and not have them get at the reason 
sooner or later. Nevertheless, the attention of the class 
should never be focussed upon the fact that they are the 
dull group. Brewer suggests the following: "Classification 
or sectioning by ability should proceed with great caution 
and must always be tentative; furthermore, particular care 
should be taken not to use such expressions as 'low level', 
'dull pupils', 'inferior children', and 'low group'. 
Neither should sectioning prevent the free contact of all 
kinds of children with each other, at least in student 
activities, athletics, auditorium exercises, music, and the 
like, which should be used to integrate the student body 
in preparation for later life activities, where of course 
persons of a variety of intellectual levels are [illegible]." (12) 

Ability grouping has often been criticized as being 
undemocratic. It is argued that democratic conditions do 
not prevail in the classroom if any attempt has been made 
to segregate pupils according to their relative mental 
abilities. Another objection occasionally voiced is that 
the dull need the stimulus of the bright. The objection 
that gifted pupils will tend to overwork when grouped 
together is frequently cited. Besides all these, it is 
said, at times, that homogeneous sectioning is not the best 
policy because dull pupils learn a great deal from the 
recitations of the bright pupils. 

(11) Symonds, Percival M., Measurement in Secondary 
Education, pp. 477, The Macmillan Co., 1930. 

(12) Brewer, John M., Education as Guidance, The Macmillan 
Co., 1932, pp. 581. 




K. THE GROUP INTELLIGENCE TEST AS AN AID TO THE COMMERCIAL 
TEACHER 

I. Development of the Group Test 

Mental testing is of rather recent origin even though 
its origins may be traced back to a distant past. Earlier 
in this study (pages 16 to 20 inclusive), a short historical 
development of the intelligence test was presented. The 
epochal work of the French genius, Alfred Binet, gave a tremendous 
momentum to the intelligence-testing movement. His 
work is of great importance historically because so much of 
the later development of intelligence testing can be traced 
to ideas that he originated. Shortly after 1900, Binet and 
his co-worker, T. Simon, set to work in connection with the 
public school system of Paris on the problem of picking out 
those children who were likely to fail to profit from their 
school work. As a result of their studies, the famous 
Binet-Simon scale consisting of thirty tasks or exercises 
was published in 1905. These tasks were arranged in order 
of increasing difficulty, but were not grouped according to 
mental age. 

After further experimentation during the next few years, 
Binet revised the original scale in 1908. The important 
point to note about this revision is that the tests are now 
grouped according to their appropriate ages. In 1911, the 
year of his death, his last important article on mental 
tests appeared, and it contained a further revision of his 
scale. This revision differed from the 1908 Scale in 
arrangement of tests and in the allotment of tests to each 
age. Some new tests were included and some of the old were 
dropped, with the result that the revised scale comprised 
fifty-four tests in all. It is evident that Binet's work 
can be traced to his interest in abnormal psychology. The 
practical sociological problem of how best to help various 
defective and delinquent classes was of great interest to 
him. Pintner says, "If, in the history of psychology, we 
call Wundt the father of experimental psychology, we must 
then call Binet the father of intelligence testing." (1) 

In the beginning, the mental testing movement in America 
was based upon the beliefs of "functional psychology". The 
work of Cattell with the Columbia University freshmen in 1890 
illustrates this point. Ragsdale says, "Cattell used essentially 
the assumptions of the functional psychology which 
were to be explicitly formulated only a decade later. This 
school of psychological thought believes that mind can be 
understood as being composed of a large number of functions 
or ways of [illegible]." (2) It was logical under these conditions 
for psychologists to attempt to measure each of the mental 
functions separately. When Binet began his work, he accepted 
the psychological assumptions in vogue, viz., that there 
were many mental functions, each of which it was desirable 
to test. In addition, however, he made an assumption which 
radically changed the character of the tests. Ragsdale 
says, "Binet assumed that mental functions of all kinds 
develop at approximately the same rate. This assumption 
of Binet's made it unnecessary to be greatly concerned about 
just which mental function was being measured by any given 
test." (3) 

(1) Pintner, Rudolph, Intelligence Testing: Methods and 
Results, pp. 32, Henry Holt & Co., 1931. 

(2) Ragsdale, Clarence E., Modern Psychologies and 
Education, pp. 215, The Macmillan Co., 1932. 

Even before the publication of the Binet-Simon Scale, 
Dr. Lewis M. Terman of Stanford University had been working 
on the problem of individual differences among school 
children, and when Goddard brought out the first American 
publication of the Binet tests in 1908, Terman seems to have 
become interested in Binet's method. During the years 1910 
and 1911 Terman and Childs tentatively revised the Binet 
1908 Scale; this revision was published in 1912. Pintner 
holds that: "Terman considered this merely a tentative 
revision, because his experience with the Scale so far had 
shown him the great possibilities in the way of further 
extension and more complete standardization." (4) During the 
next five years Terman and his co-workers occupied themselves 
with the revision, extension, and standardization of 
the Binet Scale, and their final results certainly justified 
this expenditure of time. 

(3) Ibid, pp. 217. 

(4) Pintner, Rudolph, Intelligence Testing: Methods and 
Results, pp. 40, Henry Holt & Co., 1931. 



The Stanford Revision does not contribute anything that
is essentially new to Binet's ideas. Binet originated the
method; Terman worked to perfect it. A more complete
standardization of the tests was effected. In his work,
Terman found the need of a new statistical term in order to
express his findings properly. This new term was the
Intelligence Quotient (I.Q.), which was later to become so
important. It is noteworthy, however, that this concept was
not original with Terman, although its subsequent wide use
can be traced directly to his adoption of it. Kelley says, "Stern
in 1912 was the first to use in print the term 'mental
quotient', meaning thereby the mental age divided by the
chronological age. Kuhlman independently, in the spring of
1912, hit upon the same device, and published a little later.
The concept here discussed is now the familiar Intelligence
Quotient. Terman has adopted the term and investigated the
concept. As a result of these studies it appears that one's
intelligence quotient is, at least to quite a marked degree,
constant throughout [life]."(5) "The Stanford Revision by
Terman in 1916 is the best known and is today the standard
instrument for individual [...]."(6)
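
The quotient Kelley describes can be illustrated with a short calculation. The sketch below is not drawn from any of the works cited: the function name and the sample ages are hypothetical, and the multiplication of the ratio by 100, the convention generally associated with the Stanford Revision, is assumed here rather than stated in the passage above.

```python
def intelligence_quotient(mental_age_months: float,
                          chronological_age_months: float) -> float:
    """Mental age divided by chronological age, scaled by 100 (assumed convention)."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age_months / chronological_age_months


# Hypothetical child: mental age of 12 years, chronological age of 10 years.
iq = intelligence_quotient(12 * 12, 10 * 12)
print(round(iq))
```

Expressing both ages in months avoids fractional years; a child whose mental age equals his chronological age receives a quotient of 100 under this convention.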

Since 1915 there has been an important shift in psychological
theory from the functional to the behavioristic




(5) Kelley, Truman Lee, Interpretation of Educational 
Measurements, pp. 5, World Book Co., 1927. 


(6) Monroe, Walter S., DeVoss, James C., and Reagan, George
W., Educational Psychology, pp. 266, Doubleday Doran
& Co., Inc., 1930.




psychology. The behavioristic or objective psychology is
not interested in attempting to determine the strength of
any individual function or capacity, but instead attempts
to obtain samples of behavior in the hope that from these
samples may be inferred the child's intelligence. Ragsdale
says, "We are taking small samples of behavior in the hope
that by using them we shall be able to estimate the child's
present behavior status and make predictions concerning the
kind of behavior which he may be expected to show in the
[future]."(7)

The present-day plan of measuring intelligence is to
request the child to give an observable performance and then
to infer his intelligence from the obtained results. It
must be emphasized that we do not measure intelligence
directly. That we cannot measure intelligence with complete
accuracy may be deduced from the following statement:
"Strictly speaking, we do not measure intelligence. We
measure certain achievements, and from the results obtained
we infer the status of the child's [intelligence]."(8)

The Binet-Simon Scale and the Stanford Revision of it
are both individual tests and can be given to only one pupil
at a time. From the standpoint of practical school use,
there is one great disadvantage in the individual intelligence
test. Expressed in the words of Adams and Taylor:


(7) Ragsdale, Clarence E., Modern Psychologies and Education,
pp. 222, The MacMillan Co., 1932.


(8) Monroe, Walter S., DeVoss, James C., and Reagan, George
W., Educational Psychology, pp. 265, Doubleday Doran &
Co., Inc., 1930.

"In all of these tests, it was soon recognized that they
were very costly of time."(9) The time element is a very
important factor if thousands of pupils are to be tested.

Psychologists were hesitant to accept the results of
group mental tests at the beginning because they felt that
these instruments were very inaccurate in comparison with
the individual tests. The group test was slow in arriving
and in establishing itself as a legitimate method for the
measurement of mental ability. Pintner says, "The early
attitude of psychologists towards group tests was decidedly
[...]."(10)

There are several important differences between a
group test and an individual test. These differences may
be grouped under the following headings:

1. Differences in the number of individuals
measured at the same time, and

2. Differences in the method of testing.

It is logical to assume that the individual test would
yield the more accurate results because the examiner
deals with only one child at a time. When an individual test
is being administered in the psychological clinic, the
examiner is very careful to try to win the confidence of
the child. If the child is antagonistic, fatigued, or


(9) Adams, Jesse E., and Taylor, William S., An Introduction
to Education and the Teaching Process,
pp. 172, The MacMillan Co., 1932.


(10) Pintner, Rudolph, Intelligence Testing: Methods and
Results, pp. 180, Henry Holt & Co., 1931.


badly frightened, it is customary for the school psychologist
to postpone the testing of this pupil until more
favorable testing conditions are obtainable. Pintner calls
attention to this difference between the group test and the
individual test in the following words: "The group test,
therefore, is not as pure a measure of intelligence as the
individual test. The group test contains in its score not
only a measure of the intelligence of the individual, but
also a measure of his willingness to cooperate and put
forth his best [efforts]."(11) Levine and Marks say, "The group
test places the examiner in the attitude of the physician
who administers an anaesthetic without the precaution of
keeping a firm hand on the patient's pulse: the examiner
is denied the corroborative evidence of imaginative
insight."(12)

The gradual evolution of the group intelligence test
indicates clearly that each new development represented the
work of some practical psychologist who was faced with a
problem. As tests for different mental processes were
multiplied, the group method of testing became popular.
The transition from the single group test to a series of
group tests, the results of which could be combined into an
intelligence rating, was the logical result of further


(11) Ibid, pp. 183 


(12) Levine, Albert J., and Marks, Louis, Testing 
Intelligence and Achievement, pp. 161, The 
MacMillan Co., 1928. 




experimentation in group testing. Pintner says, "Thorndike
was among the first to see the advantages of this method
and he must certainly be considered the leader in this
[field]."(13)

It is now accepted educational procedure to interpret
achievement test results in the light of intelligence test
scores. The progressive teacher avails himself of the
group intelligence test in order to find out why a certain
class should get such low marks in their subject-matter
tests. If he finds that they score rather low on the group
intelligence test, it is a pretty good index that his
instructional methods will have to be varied, and that a
regular program of diagnostic testing and remedial teaching
must be faithfully carried out. Buckingham says, "It is
a curious fact that although many persons felt the insufficiency
of the subject-matter tests, few seemed at first
to realize that what we most needed in order to make our
test scores of real worth was intelligence scores to place
beside them. Until the advent of the group intelligence
test, this was practically [impossible]."(14)

We have previously noted how the group test idea was
received at the beginning with skepticism, and even actual
hostility, by many psychologists. Today group tests are




(13) Pintner, Rudolph, Intelligence Testing: Methods and
Results, pp. 181, Henry Holt & Co., 1931.


(14) Buckingham, Burdette Ross, Research for Teachers,
pp. 140, Silver, Burdett & Co., 1926.




being used very widely and have the sanction of all the
psychologists. In the early days of group testing, an
event occurred which supplied a tremendous impetus to the
group test movement. The entrance of the United States
into the World War necessitated the building of a great
army. In order to do this, it soon became apparent that
some system of group mental testing would have to be used
in order to determine the intelligence levels of the individual
soldiers. It is logical that the men possessing a
high order of intelligence would have the greatest likelihood
of succeeding as officers. On the other hand, the
problem of what to do with prospective soldiers having
inferior intelligence had to be considered. Later it was
found that many of these individuals could be assigned to
labor battalions; many others, however, were discharged
from the army because of mental defects. The successful
use by the United States Army of group psychological tests
hastened the improvement of this type of test instrument.

Professor Pintner gives the following summary of facts
about the Army Tests: "The work in the army extended from
September, 1917, to January, 1919. Psychological testing
was established in thirty-five camps and altogether 1,726,966
men were tested either by means of group or individual tests.
This total includes 42,000 commissioned officers. Individual
examinations to the number of 82,500 were given. The




psychologists recommended 7,800 for discharge for mental
defect, or 0.5 per cent of the total examined. They
recommended 10,014 or 0.6 per cent for labor battalions
because of low intelligence, and 9,487 or 0.6 per cent for
assignment to development battalions for training and
observation for possible use in the [army]."(15)
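
The percentages in Pintner's summary can be compared with the counts he gives. The short calculation below is a sketch only; it assumes that the stated total of 1,726,966 men tested is the base for each percentage, which the passage does not say outright, so the computed shares need not agree with the quoted figures to the last decimal.

```python
# Counts as quoted from Pintner in the passage above.
TOTAL_TESTED = 1_726_966
recommendations = {
    "discharge for mental defect": 7_800,
    "labor battalions": 10_014,
    "development battalions": 9_487,
}


def share_of_total(count: int, total: int = TOTAL_TESTED) -> float:
    """Return count as a percentage of total (assumed base: all men tested)."""
    return 100.0 * count / total


for label, count in recommendations.items():
    print(f"{label}: {share_of_total(count):.2f} per cent")
```

On this assumption each group comes to roughly half of one per cent of the men tested, which is the order of magnitude the quotation reports.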

The tests used consisted of two group tests, viz.: the
Alpha, a group test for literates, and the Beta, a group
test for illiterates and foreigners. Besides these, individual
tests such as the Stanford and the Point Scale and
Performance tests were used. To Dr. Arthur S. Otis of
Stanford University goes great credit for the development of
the Army Tests.

According to Adams and Taylor, there are now between
thirty and forty group tests that are rather widely used.(16)
It has been shown, now, that the group intelligence test
has experienced a great development. Should the reader
assume, then, that the individual intelligence test has outlived
its usefulness? The answer is most emphatically,
"No!" At present the individual-type test is used to
supplement the group test. Both types of instruments are
valuable aids to the school psychologist. Adams and Taylor
say, "The wide use of the group test does not mean that the


(15) Pintner, Rudolph, Intelligence Testing: Methods and 
Results, pp. 318, Henry Holt & Co., 1931. 


(16) Adams, Jesse E., and Taylor, William S., An Intro- 
duction to Education and the Teaching Process, The 
MacMillan Co., 1932. 




individual test has been discarded. The individual tests
are also widely used, particularly for a more refined
measurement or as a further test when there is a doubt
about the results obtained from an individual who has taken
the group [test]."(17)
II. Common Types of Material in Group Tests 
At this time it would be logical to make a brief
comparison between the materials suitable for the group
intelligence test and those adaptable for use in achievement
tests. Professor Dearborn makes the following distinction:
"The school examination requires a special bit of knowledge
which has usually been recently acquired; the intelligence
test tests the use of old and fairly common knowledge often
in a new or somewhat unusual [...]."(18) The materials included
in the group psychological test should involve fairly
common experiences rather than special learning. In this
respect, the mental test resembles the puzzle or riddle,
since the latter does not call for special learning, but
rather, ingenuity in the using of ordinary life experiences.
There are two distinctive features of a good test, viz.:

1. It utilizes fairly common experiences rather than
special learning; it calls for ingenuity in the
attacking of problems yet unsolved by the
individual.

2. The test requires a sampling or averaging of the
individual's abilities.

(17) Ibid, pp. 172.

(18) Dearborn, Walter Fenno, Intelligence Tests: Their
Significance for School and Society, pp. 55,
Houghton-Mifflin Co., 1928.




There are valid reasons why the group intelligence test
should contain a large number of items. Wheeler and Perkins
summarize these as follows:

"1. With individuals raised in different environments
and having different life-interests, a similar
score in the test will not mean the same for one
person as for another unless the items are
sufficiently numerous to give each person an
equal chance.

2. There must be a sufficient number of items graded
in difficulty to differentiate those persons who
can comprehend only the simpler relationships
from those who are able to grasp more complex
relationships; there must be an appropriate
number of items not too hard for dull individuals
and not too easy for individuals who are brilliant.

3. A wide range of facts and relationships must be
covered to avoid making a test of specialized
interests and aptitudes.

4. The items thus varied and graduated must yield
results that can be expressed in terms of numbers,
and these numbers must represent the relative
position of the individual in the group."(19)

The most common types of material in group tests will
now be presented:

a. Opposites - The subject is called upon to write down or
indicate the opposite of a given word, or to decide whether
two words denote similar or dissimilar ideas.
Example - Underline the word in parenthesis which is the
opposite of the first word:

accept.....(receive, percept, deny, reject, spend)


(19) Wheeler, Raymond Holder, and Perkins, Francis 
Theodore, Principles of Mental Development, pp. 177, 
Thomas Y. Crowell Co., 1932. 




b. Analogies - An analogy between a pair of facts is given
and the subject is called upon to draw a similar analogy in
reference to another pair.
Example - Underline the best of the four words in parenthesis:
cellar : attic :: bottom : (well, tub, top, house)

c. Best Reasons - The subject is required to indicate in
some form or other the best answer to a question.
Example - Check the best reason:
Why are criminals locked up?
1. To protect society
2. To get even with them
3. To make them work

d. Disarranged Sentences - A sentence is given in which the
words are disarranged and the subject has to arrange them
properly.
Example - Cross out the superfluous word in the disarranged
sentence:
watch summer the man stole is jail who the in.

e. Proverbs - The subject has to match proverbs having the
same meaning, or decide whether they are the same or different
in meaning, or match them with statements that are
identical in meaning. An example of this would consist of
a number of proverbs to be matched with statements that
explain their meaning.

f. Number Completion - The subject is required to determine
the rule or method in a series of numbers and indicate this




in some way.
Example - Write down the two numbers that should come next:
[...] 6 9 [...] 18 ..... .....

g. Directions - The subject is asked to follow specific
instructions.
Example - Cross out the "g" in tiger.

h. Sentence Completion - The subject is to fill in
omitted words in a sentence or passage.
Example - Write one word on each blank:
The boy.....two dollars to the Red Cross.

i. Information - The subject is required to use his general
information over a wide field.
Example - Underline the correct word:
Euchre is played with dice, rackets, cards, pins.
The Delco System is used in plumbing, filing, ignition,
and cataloguing.

j. Arithmetical Problems - The subject is required to solve
reasoning problems in arithmetic.

k. Word Knowledge - The subject is required to give the
meaning of single words or words in sentences.
Example - Underline the word that means the same or nearly
the same:
kind - (1) open, (2) fall, (3) good, (4) not far, (5) new.

l. Classification, Generalization - The subject is required
to classify, generalize, or make a logical selection. There




are many tests of this type.
Example - Draw a line under the two words which tell what
the thing always has.
A circle always has: altitude, circumference, latitude,
longitude, radius.

m. Non-Verbal Material - There is a duplication in non-verbal
material of almost all of the verbal types of
materials. For obvious reasons no examples will be given
here.
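
Of the types listed above, Number Completion (item f) is one whose scoring rule can be stated exactly when the series follows a constant-difference rule. The sketch below assumes such a rule; the function name is hypothetical and the series shown is an invented illustration, not an item taken from any published test.

```python
from typing import List, Sequence


def next_terms(series: Sequence[int], count: int = 2) -> List[int]:
    """Continue a constant-difference (arithmetic) series by `count` terms."""
    if len(series) < 2:
        raise ValueError("need at least two terms to infer the rule")
    step = series[1] - series[0]
    # Confirm that every adjacent pair shares the same difference.
    if any(b - a != step for a, b in zip(series, series[1:])):
        raise ValueError("series does not follow a constant-difference rule")
    last = series[-1]
    return [last + step * k for k in range(1, count + 1)]


# Invented item: 2 5 8 11 ..... .....
print(next_terms([2, 5, 8, 11]))
```

Items built on other rules (doubling, alternating steps, and so on) would need their own checks; the point is only that the keyed answer follows mechanically once the rule is fixed.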
III. Limitations of Our Present Intelligence Tests: 
Validity and Accuracy. 
A. Validity 

Before a test can be accepted as a "good" test, certain
requirements as to its validity and reliability must be
fulfilled. These two topics have been treated in great
detail elsewhere in this thesis (pp. 39-72 inc.). It will
be necessary at this point, however, to supplement what has
already been given, since the latter applies mainly to
educational tests. Much of what has already been given will
apply equally well to either subject-matter tests or intelligence
tests.

Monroe, DeVoss, and Reagan say, "Since it is extremely
unlikely that any one of our present intelligence examinations
is a 'perfect' instrument, [they] all fail to
yield valid and accurate measures."(20)


(20) Monroe, Walter S., DeVoss, James C., and Reagan,
George W., Educational Psychology, pp. 281, Doubleday
Doran & Co., Inc., 1930.




If this statement is true, it is safe to say that there are
no group mental tests that are one hundred per cent valid
and reliable. Let us consider at this time the following
questions:

1. How nearly do the results yielded from the test
agree with the results yielded by true measures
of intelligence?

2. How accurately does a given test measure what it
measures?


The determination of the validity of an intelligence
test would be simple if we possessed some means of securing
true measures of intelligence. Because such true measures
are not available, test-makers have used a variety of
approximations. A careful application of the Stanford
Revision of the Binet Test is one of the most
widely used criterion measures.

Because the validity of a test must be determined by 
means of some outside criterion of intelligence, we shall 
now consider some of the common criteria employed. 

1. Chronological Age

This criterion was employed by Binet in his early work.
According to this criterion, a test of intelligence should
be passed by increasing percentages of children as we go
from the lower to the higher grades. Pintner says, "This
criterion is of limited value and is not commonly used by
psychologists today, although it is useful in helping to


’ ' 


determine the relative discriminating values of a series
of tests."(21)
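
The age criterion lends itself to a simple check: tabulate the percentage of children passing an item at each successive age (or grade) and confirm that the percentages rise. The sketch below uses invented figures and a hypothetical function name purely for illustration; it is not a procedure taken from Binet or Pintner.

```python
from typing import Dict


def passes_age_criterion(percent_passing_by_age: Dict[int, float]) -> bool:
    """Return True if the percentage passing rises with each older age group."""
    ages = sorted(percent_passing_by_age)
    percents = [percent_passing_by_age[age] for age in ages]
    return all(earlier < later for earlier, later in zip(percents, percents[1:]))


# Invented data: per cent of children at each age who pass one test item.
item_results = {8: 22.0, 9: 41.0, 10: 63.0, 11: 80.0}
print(passes_age_criterion(item_results))
```

An item whose percentages fail to rise would, on this criterion, be a poor candidate for an intelligence scale, though as the quotation notes the criterion by itself is of limited value.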

2. Known Groups

This criterion was also used by Binet in his early
work. If we have two groups of known intelligence, as for
example, a feebleminded group and a normal one, it is
obvious that a test administered to them should result in
much higher scores for the better group. If there is little
relative difference between the results obtained, then the
test cannot be a good measure of intelligence. In the same
way, the scores on a group intelligence test for a superior
class could be compared with those from a normal group.
Pintner says, "As a first rough measure of the goodness or
badness of a test for intelligence testing purposes, it has
proved [...]."(22)

3. Teachers' Judgments

The judgment of teachers as to the intelligence of their
pupils is frequently used as a criterion of the validity of
a test. The theory behind this is that the teachers are in
a most favorable position to judge the relative levels of
intelligence among their pupils and consequently, the
results from the intelligence test ought to correlate somewhat
with their judgment. The results from this criterion
must not be accepted too readily. The judgments of all
human beings are fallible. Pintner holds that:


(21) Pintner, Rudolph, Intelligence Testing: Methods and 
Results, pp. 105, Henry Holt & Co., 1931. 


(22) Ibid, pp. 106.



"If we are constructing a new test and find that it correlated
about .3 and .6 on the average with teachers' judgments, we
should be satisfied so far as this criterion of validity is
[concerned]."(23)
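
When the teacher ranks the pupils rather than scoring them, one way to obtain a coefficient of the kind discussed above is the rank-difference method. The sketch below assumes Spearman's rank-difference formula with no tied ranks; the source does not say which coefficient Pintner had in mind, and the function names and pupil data are invented for illustration.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Rank values from highest (rank 1) to lowest; this sketch assumes no ties."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


def rank_difference_correlation(test_scores: Sequence[float],
                                teacher_ranks: Sequence[int]) -> float:
    """Spearman's rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid without ties."""
    n = len(test_scores)
    if n != len(teacher_ranks) or n < 2:
        raise ValueError("need two equal-length sequences of at least two pupils")
    score_ranks = ranks(test_scores)
    d_squared = sum((a - b) ** 2 for a, b in zip(score_ranks, teacher_ranks))
    return 1.0 - 6.0 * d_squared / (n * (n ** 2 - 1))


# Invented data: test scores for six pupils and the teacher's ranking (1 = brightest).
scores = [112, 98, 105, 87, 120, 93]
teacher = [3, 4, 2, 6, 1, 5]
print(round(rank_difference_correlation(scores, teacher), 2))
```

With real classes ties are common, and a product-moment coefficient computed on the ranks would then be the safer choice; the closed formula is used here only because it makes the arithmetic easy to follow.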

4. School Achievement

School achievement may be used in validating a group
psychological test in any one of three ways, viz., school
marks allotted by teachers, scores on standard educational
tests, or the rate of progress through
the grades. Whenever we take any of these ratings, we assume
that the intelligent child will work more or less up to his
capacity in school work. In regard to rate of progress, we
assume further that he will be allowed to progress through
the grades at a rate commensurate with his intelligence.
Pintner says, "The use of educational tests as measures of
validity for intelligence tests is, therefore, of limited
value. An intelligence test should correlate fairly well
with educational achievement, but we cannot use an educational
achievement test as the sole [criterion]."(24)

5. Other Tests

Another validation method is to correlate the new test with a
known test of intelligence. If the Stanford-Binet is
accepted as a valid intelligence test, it is evident that
the results obtained from it ought to correlate positively


(23) Ibid, pp. 107. 
(24) Ibid, pp. 110. 




with the results obtained from the test that is being
validated. The important point to stress in connection with
this criterion is that the test accepted as a valid measure
of intelligence must itself have been adequately validated;
otherwise the obtained results will be valueless.
6. Combinations of These

No single validation method explained so far
will by itself yield a complete validation. All suffer
from some drawback; hence it was hoped that a combination
of them would enable the test experts to arrive at a better
criterion. Pintner reports that McCall and Liu each used a
composite criterion with no little success.(25) McCall used a
criterion made up of teachers' judgments, Binet, group
intelligence tests and measures of educational achievement.
Liu used the most elaborate criterion for estimating intelligence.
His criterion consisted of (1) age, (2) school
marks, (3) school progress, (4) teachers' estimates of
intelligence, and (5) composite test scores of five group
intelligence scales. The different elements in the criterion
were carefully weighted. The evidence indicates that Liu
obtained very good results in the use of his method. Pintner
concludes that: "From this survey of the various methods
of determining the validity of an intelligence test, we can
see that there is no one method that is infallible. Each


(25) Ibid, pp. 112.




single method is open to objections. The psychologist must
use as many of these methods as he can. The construction
of a composite criterion is undoubtedly the safest [...]."(26)
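
A composite criterion of the kind attributed to Liu can be sketched as a weighted sum of standardized measures. Everything specific below is an assumption made for illustration: the weights, the pupil figures, the function names, and the use of z-scores to put the elements on a common scale are not taken from Liu, McCall, or Pintner.

```python
from statistics import mean, pstdev
from typing import Dict, List


def standardize(values: List[float]) -> List[float]:
    """Convert raw values to z-scores (mean 0, standard deviation 1)."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant measure")
    return [(v - m) / s for v in values]


def composite_criterion(measures: Dict[str, List[float]],
                        weights: Dict[str, float]) -> List[float]:
    """Return one weighted composite score per pupil."""
    if set(measures) != set(weights):
        raise ValueError("each measure needs exactly one weight")
    lengths = {len(vals) for vals in measures.values()}
    if len(lengths) != 1:
        raise ValueError("every measure must cover the same pupils")
    n_pupils = lengths.pop()
    z = {name: standardize(vals) for name, vals in measures.items()}
    total_weight = sum(weights.values())
    return [
        sum(weights[name] * z[name][i] for name in z) / total_weight
        for i in range(n_pupils)
    ]


# Invented figures for four pupils; the weights are hypothetical.
measures = {
    "school_marks": [70.0, 85.0, 60.0, 90.0],
    "teacher_estimate": [3.0, 4.0, 2.0, 5.0],
    "group_test_score": [101.0, 115.0, 92.0, 124.0],
}
weights = {"school_marks": 1.0, "teacher_estimate": 1.0, "group_test_score": 2.0}
composite = composite_criterion(measures, weights)
print([round(c, 2) for c in composite])
```

Standardizing first keeps a measure with a wide numerical range, such as test scores, from dominating one with a narrow range, such as a five-point teacher estimate; the weights then express only the judgment of the investigator.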
B. Reliability 

It is important not only that the group psychological
test be valid but also that it be reliable. The test must
not only measure what it is intended to measure; it
must also measure it accurately. As we have seen in our
previous studies (pages 56-72 inclusive of this thesis), the
usual measure of reliability is the coefficient of correlation
between the two forms of a test. In view of the fact
that the subject of reliability has been treated quite fully
previously in this thesis, the present writer will not
include any new material. It is sufficient to say that the
methods of insuring reliability in a new-type educational
test are identical to those used in reference to the group
psychological examination.
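
The coefficient mentioned here can be computed directly from the scores of the same pupils on two forms of a test. The sketch below assumes the Pearson product-moment formula; the function name, the form names, and the scores are invented for illustration.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation between paired scores x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences of at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when one form has no variation")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Invented scores of five pupils on Form A and Form B of the same test.
form_a = [42, 55, 38, 61, 47]
form_b = [45, 53, 40, 64, 44]
print(round(pearson_r(form_a, form_b), 2))
```

A coefficient near 1 indicates that the two forms place the pupils in nearly the same relative order, which is what is meant by a reliable test in this sense.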

IV. Practical Values of Intelligence Tests
A. Educational Guidance

The group psychological examination is a valuable
source of help to the practical teacher in many situations.
It is especially valuable in reference to guidance work.
It happens quite frequently in large high schools that
certain pupils are not scheduled correctly, and consequently,


(26) Ibid, pp. 112. 




are put into a class of superior pupils. Many times these
unadjusted pupils will become discipline problems because
they have no other outlet for their energies. The accomplishment
of their classmates is so superior that these
"problem cases" are left hopelessly in the rear. Buckingham
makes the following comment: "To place a person of low
intelligence, whether that person is a school child or an
adult worker, in a position which requires a higher degree
of intelligence is to rob him of the satisfaction of success
and to engender the habit of failure. On the other hand,
to place a person of high intelligence in a position which
calls for lower intelligence tends to weaken the fiber of
his moral [...]."(27) The obvious solution to these problems
of maladjustment is more effective educational guidance.
If unusual cases arise, the school psychologist's aid should
be sought.

Now, this problem of effective educational guidance is
not so simple as it may seem. In many cases, it is difficult
or almost impossible for the class-room teacher to
judge correctly the intelligence of a certain pupil. It
is readily seen that the teacher's impressions will not
insure a correct analysis as to the pupil's mental level.
In many cases, the teacher's impressions are biased by the
appearance of the pupil. Good clothes and external




(27) Buckingham, Burdette Ross, Research for Teachers, 
pp. 162, Silver, Burdett and Co., 1926. 




appearances of health are often deciding factors in the
teacher's estimate. Many times the teacher is deluded
because of a sprightly attitude on the part of the pupil.
Adams and Taylor say, "For the most part mental measurements
have been of great value in helping us to detect
more accurately the level of intelligence of individuals.
They furnish a standard to grade by, which in itself is of
material [...]."(28)

B. Sectioning of Classes

Another important use made of the group mental test
is in the sectioning of classes. Many high schools and
colleges section their classes according to ability with
the intent of varying the instruction offered in order to
meet the intellectual level of the particular classes.
As this topic has been treated in detail elsewhere in this
study (pages 168-175 inclusive), it is not necessary to
take up any additional material.

C. Surveys

The group psychological examination has been very
important in the making of educational surveys. Its use
makes possible comparisons between communities.

D. Vocational Guidance Values

Mental tests have value for predictive purposes in both
the vocational and the educational fields. They are of




(28) Adams, Jesse E., and Taylor, William S., An Introduction
to Education and the Teaching Process,
pp. 176, The MacMillan Co., 1932.




great assistance to the employer in selecting employees,
and also to the individual, since the latter uses them to
find out the vocation for which he is best
suited. The results from mental tests are invaluable to
the vocational counselor in enabling him to advise
pupils as to what type of course to take in high school.
It follows logically that a pupil of low intelligence who
elects the technical course which prepares for the college
of engineering should be advised to change his course
because of the limited possibilities of his succeeding.
Ragsdale holds that: "By using intelligence test scores
in connection with other information which has been obtained
about a pupil, it has been found possible to predict
fairly well the course of his future scholastic [...]."(29)
Wheeler and Perkins say, "Tests are useful in high school
in giving vocational advice, in helping the student select
his course, and in determining the rate at which the student
should attempt to [...]."(30) The weight of authority tends
toward the conclusion that the psychologists by their
development of the group psychological examination have
performed a signal service to the schools of America.
Ve. Group Intelligence Tests Suitable for Use in Secondary 
Schools 
At the beginning of this section the present writer 


(29) Ragsdale, Clarence E., Modern Psychologies and 
Education, pp. 227, The Macliillan Co., 1932. 


(30) Wheeler, Raymond Holder, and Perkins, Francis Theodore, 
Principles of Mental Development, pp. 190, Thomas Y. 
Crowell Co., 1932. 




wishes to call attention to the impossibility of describing
and explaining all of the group psychological tests that
have been published. It is undoubtedly true that some group
tests of considerable value are not very well known because
no effort has been made to publish them. Many school
systems and colleges have developed their own psychological
examinations, yet in spite of the fact that these examinations
possess great value, they are used only in the
particular school system or institution where they were
developed. The method that will be employed in this section
will be to give a short description of some of the most
commonly used and readily available group tests.

Professors Douglass and Boardman in their latest book
recommend the following group psychological tests for high
school use: 1. Haggerty Intelligence Examination, Delta 2,
2. Miller Mental Ability Test, 3. Otis Self-Administering
Test of Mental Ability, Higher Examination, 4. Pressey
Cross-Out Test, and 5. Terman Group Test of Mental Ability. (31)
It is readily apparent that some of these tests have received
wider publicity than others. The present writer proposes to
discuss these tests briefly and then to supplement this by
a short consideration of other group intelligence tests
that are well known.




(31) Douglass, Harl R., and Boardman, Charles W.,
Supervision in Secondary Schools, pp. 5535, 
Houghton-Mifflin Co., 1934. 




Haggerty Intelligence Examination, Delta 2. 

This test consists of six exercises as follows:
1. Discrimination between true and false statements;
2. Arithmetical problems; 3. Picture completion;
4. Discrimination between words, whether same or opposite;
5. Common sense judgments; and 6. General information.
Published by World Book Co., Yonkers, New York. This test
is suitable for grades 3-9. No reliability
coefficient is reported, although the author reports
that Stenquist found an (r) of .81 on this
test after testing five hundred children from
grades 4-8 inclusive. The estimated reliability
for a single grade is .6. The test comes in one
form. Time required, 20 minutes. Author, M. E.
Haggerty.


Miller Mental Ability Test 

Consists of three tests: 1. Disarranged sentences
combined with directions; 2. Controlled association;
3. Analogies. Suitable for grades 7-12.
The test comes in two forms. Published by the
World Book Company. Time required is 20 minutes.
The reliability coefficient for re-testing 109
pupils in Grade 10 is .91. The standard deviation
is reported as 14.3. Author, W. S. Miller.


Otis Self-Administering Test of Mental Ability 
This examination is arranged for two levels, viz.,
the Intermediate Examination for Grades IV to IX
and the Advanced Examination for High Schools
and Colleges. It is a very easy test to administer,
as the subject merely reads over directions on the
first page of the test, and these directions give
samples of all the different kinds of items which
appear in the test proper. The examination for
each level is furnished in two forms, Form A
and Form B. Neither examination is divided into
sub-tests, but different types of items appear
mixed throughout the test, beginning with easy
items and proceeding to more difficult ones.
There are 75 items in each examination. The reliability
for the Intermediate Examination for
Grades IV to IX is .95 and for the Advanced
Examination for Grades VII to XII is .92. The
reliability coefficient for the Advanced Examination
is based on a sample group of 2535 pupils
from grades 7 to 12. The standard deviation is
reported as 13.82. Author, A. S. Otis.




Pressey Cross-Out Test

There are four exercises in this test, each calling
for the same type of response, viz., crossing
out something. The test is useful from Grade III
to High School. In test one, the subject is
called upon to cross out the superfluous word in
disarranged sentences; in test two, the superfluous
word in lists of words related to each
other; in test three, the superfluous number in
a number series; and in test four, a moral judgment
test in which the worst thing in the list
is to be crossed out. There are excellent norms
for these tests for ages 10 to 17, and for Grades
III to XII. No information relative to validity
and reliability is available.


Terman Group Test of Mental Ability

This test is one of the most frequently used
tests for high school purposes. It is suitable
for grades 7 to 12 inclusive. Age and grade
norms are available. The examination consists
of ten parts, viz.: 1. Information; 2. Best
answer; 3. Word meaning; 4. Logical selection;
5. Arithmetical problems; 6. Sentence meaning;
7. Analogies; 8. Mixed sentences; 9. Classification;
10. Number series. The reliability for
132 cases in Grade IX is .89. This test comes
in two forms. Published by World Book Co. The
standard deviation is reported as 24.2. Time,
27 minutes. Author, L. M. Terman.


Dearborn Intelligence Scale, Series II. 

Adapted to Grades IV to XII. It consists of two 
examinations containing the following tests: 

1. Picture sequences; 2. Word sequences; 3. Form 
completion; 4. Opposite completion; 5. Faulty 
pictures; 6. Disarranged proverbs; 7. Number 
problems. Norms for ages 6 to 20, and for Grades 
II to XII are given. Published by Lippincott Co. 
Author, Walter F. Dearborn. No information is 
available as regards validity and reliability. 


The Otis Group Intelligence Test

This group test applies to Grades V to XII. It
is divided into ten parts, viz.: 1. Following
printed directions; 2. Opposites; 3. Disarranged
sentences; 4. Matching proverbs; 5. Arithmetic;
6. Geometric figures; 7. Analogies; 8. Similarities;
9. Narrative completion; and 10. Memory.
The reliability coefficient for Grades IV to
VIII is given as .967. Author, A. S. Otis.


Detroit Advanced Intelligence Test 

This test is designed for high school and college
use. It consists of the following parts: 1.
Information; 2. Opposites; 3. Classification;
4. Number sequence; 5. Block designs; 6. Spelling;
7. Analogies; 8. Mixed-up sentences. No information
is available in regard to validity and
reliability. Norms are given for ages 9 to 25 and
letter ratings for ages 11 to 16.


Thurstone Psychological Examination 

This test can be used effectively for all years
in high school or college. It contains a large
number of problems involving analogies, number
completion, logical reasoning, mental arithmetic,
general information, sentence completion,
proverb matching and the like. The items are
arranged in spiral order, and the
different types are thoroughly mixed. The
same type of problem occurs again and again,
beginning with the easiest examples and gradually
becoming harder. The reliability
coefficient is reported as .959, and was obtained
by working with 250 subjects. The
reliabilities on the separate parts vary from
.71 to .98.
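
The reliability coefficients quoted in the descriptions above are correlation coefficients between two sets of scores earned by the same pupils, for example on a test and a re-test. The following is a minimal sketch of such a computation, assuming the product-moment (Pearson) formula. The thesis does not state how each test author arrived at his figure, and the function name and the scores used here are hypothetical illustrations only.

```python
from math import sqrt


def pearson_r(first, second):
    """Product-moment correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2 pupils)")
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    # Deviations of each pupil's score from the mean of that sitting.
    dx = [x - mean_x for x in first]
    dy = [y - mean_y for y in second]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("scores must vary within each sitting")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical scores for six pupils on two sittings of the same test.
first_sitting = [42, 55, 61, 48, 70, 66]
second_sitting = [45, 53, 64, 50, 68, 69]
reliability = pearson_r(first_sitting, second_sitting)
print(round(reliability, 2))
```

A coefficient near 1.0 means that pupils keep nearly the same relative standing on both sittings; values such as the .6 estimated for a single grade on the Haggerty examination indicate considerably less stable rankings.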




CONCLUSIONS


The necessity for a well-rounded testing program has
been frequently emphasized in this thesis. If testing in
the social-business studies is to yield rich dividends in
the form of improved teaching efficiency, it will be through
the medium of a planned and well-organized testing program.
The busy teacher with large classes is too prone to give
tests whenever it suits his personal convenience; this may
or may not be the time when a test should be given if a
definite test program is followed. Critics of the test
program can no longer clamor that the giving and scoring
of tests consume too much time. Such arguments lose their
validity when it is proved that the achievement of large
and small classes may be determined equally well by the
use of the new-type test techniques.

There are many other desirable ideas that should be
advanced in connection with the well-rounded testing
program. It is safe to conclude that a testing program
is rarely worth the time, effort, and money expended if it
does not result in some worth-while modification in classroom
procedures and practices, materials and methods of
instruction, class and school organization and management,
or some other phase of school work. A testing program
that does not measure up to the accepted standards should




be immediately challenged. In such a situation, the
Commercial Department Head should present the problem to
his teachers and enlist their aid and advice in effecting
the necessary revision. Such a step is based on sound
psychology, as it attains two worth-while ends: first, the teachers
will support more readily a test program that they
themselves have formulated, and, secondly, the new test
program will be more effective after the elimination of
obvious weaknesses.

The following specific conclusions are maintained in 
this thesis: 
1. As commercial teachers experience difficulty with the
testing problem in the social-business subjects, this field
offers abundant opportunity for research work. The field
has not yet been covered adequately, although great
strides have been taken.

2. The test concept is very ancient. Tests and examinations
of various kinds were in use hundreds and even thousands of
years ago among such peoples as the Chinese, the Greeks, and
the Romans.

3. A testing program should consist of both standardized and
non-standardized tests. In most cases, the number of the
latter should exceed that of the former, their respective
proportions depending partly on how satisfactory is the supply
of standard tests available in the [subject] being dealt with.

4. In its present state of development the new-type test
is best suited to test the acquisition of information.

5. It is in the use of tests that the greatest hope for
scientific guidance lies.

6. Although educational measurements are of great value to
the supervisor and administrator, their most valuable
contributions have been made in the improvement of teaching.

7. The testing program must provide for comparable tests.
Test results should be marked on cumulative records made
available for reference purposes to the entire staff.

8. Standards, as well as norms, should be provided for
tests.

9. The testing program must not be limited to one standard
test administered late in the year. To do this defeats the
objectives of the entire testing program.

10. Provision must be made for continuous systematic testing
of pupils at regular intervals during the year. The
further instruction of the pupils should be adapted to
their needs as shown by a study of the test results.

11. The giving of tests in itself is of little value, but
should be followed by diagnosis and remedial measures.

12. Supervision of testing is as important as supervision
of teaching. It devolves upon the Commercial Department Head




to examine the tests his teachers are using, and to assist
them in selecting or planning suitable ones.

13. It is desirable that the specifications of a marking 
system should grow out of a cooperative study of the problem 
by all the teachers concerned rather than that they be 
handed down by the executives in the system. 

14. School marks should be based solely on the result of 
tests as far as possible. 

15. From the evidence, it appears that relative marking 
systems are more justifiable than the absolute types. 

16. The marking system adopted in a school must be adhered 
to by all members of the faculty. A teacher who constantly 
deviates from the accepted rules is being unfair to the 
entire staff. 

17. The problem of individual differences can be met at 
least partially by ability grouping. 

18. Pupil classification should be neither haphazard nor 
arbitrary. 

19. There is no royal road to homogeneous sectioning. All
the evidence that is obtainable must be considered carefully;
in addition, provision must be made for the constant shifting
from one section to another whenever instances of
maladjustment occur.

20. Every good-sized secondary school should, if possible,




have a testing bureau. This bureau should be a place
where the teacher can go for desired assistance in the
construction of the various types of modern tests.

The principles underlying a well-rounded testing
program have now been explained. The present writer feels
confident that a testing program such as he advocates
would go far toward obtaining the desired results. Whether
progress is made depends to a large extent upon the
individual teacher. There are just two ways to meet the
situation. One is a policy of inaction and stagnation; the
other, a policy of action. The hope of the teaching
profession lies in the latter.




GENERAL BIBLIOGRAPHY 


A. Books, Monographs and Bulletins 


1. Alberty, H. R., and Thayer, V. T., SUPERVISION IN THE
SECONDARY SCHOOL, D. C. Heath & Co., New York City, 1931.

2. Baker, Harry J., CHARACTERISTIC DIFFERENCES IN BRIGHT
AND DULL PUPILS (An Interpretation of Mental Differences,
with Special Reference to Teaching Procedures), Public
School Publishing Co., Bloomington, Illinois, 1927.

3. Brewer, John M., EDUCATION AS GUIDANCE, The MacMillan
Co., New York City, 1932.

4. Broady, Knute O., SCHOOL PROVISION FOR INDIVIDUAL
DIFFERENCES, Bureau of Publications, Teachers College,
Columbia University, New York City, 1930.

5. Brueckner, Leo J., and Melby, Ernest O., DIAGNOSTIC
AND REMEDIAL TEACHING, Houghton-Mifflin Co., Boston,
Massachusetts, 1931.

6. Caldwell, Otis W., and Courtis, Stuart A., THEN AND NOW
IN EDUCATION 1845:1923, World Book Co., Yonkers-on-Hudson,
New York, 1924.

7. Cubberley, Ellwood P., PUBLIC EDUCATION IN THE UNITED
STATES, Houghton-Mifflin Co., Boston, Massachusetts, 1934.

8. Cubberley, Ellwood P., PUBLIC SCHOOL ADMINISTRATION,
Houghton-Mifflin Co., Boston, Massachusetts, 1929.

9. Dawson, Edgar and Others, TEACHING THE SOCIAL STUDIES,
The MacMillan Co., New York City, 1928.

10. Dolch, Edward William, THE PSYCHOLOGY AND TEACHING OF
READING, Ginn & Co., Boston, Massachusetts, 1931.

11. REVIEW OF EDUCATIONAL RESEARCH, Vol. III, Feb., 1933,
Number 1, EDUCATIONAL TESTS AND THEIR USES.

12. Ellis, Robert S., STANDARDIZING TEACHERS' EXAMINATIONS
AND THE DISTRIBUTION OF CLASS MARKS, Public School
Publishing Co., Bloomington, Illinois, 1927.




13. Garrett, Henry E., STATISTICS IN PSYCHOLOGY AND
EDUCATION, Longmans, Green and Co., New York, 1926.

14. Gibbons, A. N., TESTS IN THE SOCIAL STUDIES. A
RECORD OF A TESTING EXPERIENCE IN THE SENIOR HIGH
SCHOOL SOCIAL STUDIES, National Council for Social
Studies, Iowa City, Iowa, 1929.

15. Hildreth, Gertrude H., Ph.D., PSYCHOLOGICAL SERVICE
FOR SCHOOL PROBLEMS, World Book Co., Yonkers-on-Hudson,
New York, 1930.

16. Hull, Clark L., APTITUDE TESTING, World Book Co.,
Yonkers-on-Hudson, New York, 1927.

17. Johnson, Franklin W., ADMINISTRATION AND SUPERVISION
OF THE HIGH SCHOOL, Ginn & Co., Boston, Massachusetts, 1925.

18. Kelley, Truman L., INTERPRETATION OF EDUCATIONAL
MEASUREMENTS, World Book Co., Yonkers-on-Hudson,
New York, 1927.

19. Kimmel, William Glenn, THE MANAGEMENT OF THE READING
PROGRAM IN THE SOCIAL STUDIES, Publications of the
National Council for the Social Studies, Number 4,
McKinley Publishing Co., October, 1929.

20. Kitson, Harry D., General Editor, COMMERCIAL EDUCATION
IN SECONDARY SCHOOLS, Ginn & Co., Boston, Massachusetts,
1929.

21. Koos, Leonard V., and Kefauver, Grayson N., GUIDANCE
IN SECONDARY SCHOOLS, The MacMillan Co., New York City, 1932.

22. Kyte, George C., PROBLEMS IN SCHOOL SUPERVISION,
Houghton-Mifflin Co., Boston, Massachusetts, 1931.

23. Lang, Albert R., MODERN METHODS IN WRITTEN EXAMINATIONS,
Houghton-Mifflin Co., Boston, Massachusetts, 1930.

24. Lomax, Paul S., COMMERCIAL TEACHING PROBLEMS, Prentice-Hall,
Inc., New York City, 1928.




25. Lomax, Paul S., and Agnew, Peter L., PROBLEMS OF
TEACHING BOOKKEEPING, Prentice-Hall, Inc., New York
City, 1930.

26. Lomax, Paul S., and Haynes, Benjamin R., PROBLEMS
OF TEACHING ELEMENTARY BUSINESS TRAINING, Prentice-Hall,
Inc., New York City, 1930.

27. Lomax, Paul S., and Tonne, Herbert A., PROBLEMS OF
TEACHING ECONOMICS, Prentice-Hall, Inc., New York
City, 1932.

28. TESTS IN COMMERCIAL EDUCATION, An Annotated List
Compiled by J. O. Malott and David Segal, Circular
Number 56, Washington, D. C., November, 1932,
Department of the Interior.

29. McCall, William A., HOW TO MEASURE IN EDUCATION, The
MacMillan Co., New York City, 1922.

30. Monroe, Walter S., DIRECTING LEARNING IN THE HIGH
SCHOOL, Doubleday, Doran & Co., Inc., New York City,
1927.

31. Morrison, Henry C., THE PRACTICE OF TEACHING IN THE
SECONDARY SCHOOL, The University of Chicago Press,
2nd Edition, Chicago, 1930.

32. Nichols, Frederick G., COMMERCIAL EDUCATION IN THE
HIGH SCHOOL, D. Appleton-Century Co., New York City,
1933.

33. Odell, C. W., EDUCATIONAL MEASUREMENT IN HIGH SCHOOL,
The Century Co., New York City, 1930.

34. Orleans, J. S., and Sealy, G. A., OBJECTIVE TESTS,
World Book Co., Yonkers-on-Hudson, New York, 1926.

35. Otis, Arthur S., STATISTICAL METHOD IN EDUCATIONAL
MEASUREMENT, World Book Co., Yonkers-on-Hudson, New
York, 1926.

36. Ruch, G. M., THE OBJECTIVE OR NEW-TYPE EXAMINATION,
Scott, Foresman and Co., Chicago, 1929.


37. Ruch, G. M., and Rice, G. A., SPECIMEN OBJECTIVE
EXAMINATIONS, Scott, Foresman and Co., Chicago, 1930.

38. Ruch, G. M., and Stoddard, George D., TESTS AND
MEASUREMENTS IN HIGH SCHOOL INSTRUCTION, World Book
Co., Yonkers-on-Hudson, New York, 1927.

39. Russell, Charles, STANDARD TESTS, Ginn & Co., Boston,
Massachusetts, 1930.

40. Smith, Gale, HOW TO CONSTRUCT AND USE NON-STANDARDIZED
OBJECTIVE TESTS, The Benton Review Shop, Fowler,
Indiana, 1929.

41. Stormzand, Martin J., PROGRESSIVE METHODS OF TEACHING,
Houghton-Mifflin Co., Boston, Massachusetts, 1927.

42. Symonds, Percival M., MEASUREMENT IN SECONDARY
EDUCATION, The MacMillan Co., New York City, 1930.
(Associate Professor of Education, Teachers College,
Columbia University).

43. Tiegs, Ernest W., TESTS AND MEASUREMENTS FOR TEACHERS,
Houghton-Mifflin Co., Boston, Massachusetts, 1931.

44. Tonne, Herbert A., Ph.D., and Tonne, Henriette M.,
SOCIAL-BUSINESS EDUCATION IN THE SECONDARY SCHOOLS,
New York University Press Book Store, New York City, 1932.

45. Van Wagenen, M. J., EDUCATIONAL DIAGNOSIS AND THE
MEASUREMENT OF SCHOOL ACHIEVEMENT, The MacMillan Co., New
York City, 1926.

46. Walters, R. G., MODERN METHODS OF TEACHING COMMERCIAL
SUBJECTS, Monograph Number 16, South-Western Publishing
Co., New York City, 1932.

47. Weidemann, Charles C., HOW TO CONSTRUCT THE TRUE-FALSE
EXAMINATION, Bureau of Publications, Teachers College,
Columbia University, New York City, 1926.

48. Wood, Ben D., MEASUREMENT IN HIGHER EDUCATION, World
Book Co., Yonkers-on-Hudson, New York, 1923.






49. Wheeler, Raymond Holder, and Perkins, Francis Theodore,
PRINCIPLES OF MENTAL DEVELOPMENT, Thomas Y. Crowell Co.,
New York City, 1932.

50. Buckingham, Burdette Ross, RESEARCH FOR TEACHERS, Silver
Burdett & Co., New York City, 1926.

51. Adams, Jesse E., and Taylor, William S., AN INTRODUCTION
TO EDUCATION AND THE TEACHING PROCESS, The MacMillan
Co., New York City, 1932.

52. Kelley, Truman Lee, INTERPRETATION OF EDUCATIONAL
MEASUREMENTS, World Book Co., Yonkers-on-Hudson, New
York, 1927.

53. Pintner, Rudolph, INTELLIGENCE TESTING, Henry Holt and
Co., New York City, 1931.

54. Ragsdale, Clarence E., MODERN PSYCHOLOGIES AND EDUCATION,
The MacMillan Co., New York City, 1932.

55. Monroe, Walter S., DeVoss, James C., and Reagan, George
W., EDUCATIONAL PSYCHOLOGY, Doubleday, Doran and Co.,
New York City, 1930.

56. Dearborn, Walter Fenno, INTELLIGENCE TESTS, Houghton-Mifflin
Co., Boston, Massachusetts, 1928.

57. Douglass, Harl R., and Boardman, Charles W., SUPERVISION
IN SECONDARY SCHOOLS, Houghton-Mifflin Co., Boston,
Massachusetts, 1934.

58. Levine, Albert J., and Marks, Louis, TESTING INTELLIGENCE
AND ACHIEVEMENT, The MacMillan Co., New York
City, 1928.


B. Magazine Articles 


59. Blackhurst, J. Herbert, DO WE MEASURE IN EDUCATION,
Journal of Educational Research, December, 1933.

60. Brueckner, Leo J., THE VALIDITY AND RELIABILITY OF
EDUCATIONAL DIAGNOSIS, Journal of Educational Research,
September, 1933.




61. Crooks, A. Duryee, MARKS AND MARKING SYSTEMS: A
DIGEST, The Journal of Educational Research,
December, 1933.

62. Gerberich, J. R., A TECHNIQUE FOR MEASURING THE ABILITY
TO EVALUATE OBJECTIVE TEST ITEMS, Journal of Educational
Research, September, 1933.

63. Jeep, H. A., MUST OBJECTIVE TESTS BE DOGMATIC, Educational
Administration & Supervision, March, 1933.

64. Lee, J. Murray, and Symonds, Percival M., NEW-TYPE OR
OBJECTIVE TESTS: A SUMMARY OF RECENT INVESTIGATIONS,
The Journal of Educational Psychology, January, 1933.

65. Osborn, Worth J., TEST THINKING, Journal of Educational
Research, February, 1934.

66. Perry, Paul W., HOW STUDENTS STUDY FOR THREE TYPES OF
OBJECTIVE TESTS, Journal of Educational Research,
January, 1934.

67. Sims, Verner Martin, IMPROVING THE MEASURING QUALITIES
OF AN ESSAY EXAMINATION, Journal of Educational Research,
September, 1933.

68. Uhl, Willis L., SOME NEGLECTED ASPECTS OF EDUCATIONAL
MEASUREMENT, The Journal of Educational Research,
December, 1933.

69. Wrightstone, J. Wayne, AN INSTRUMENT FOR MEASURING
GROUP DISCUSSION AND PLANNING, Journal of Educational
Research, May, 1934.




